Let $X$ be a continuous-time Markov chain in a finite set $I$, let $h$ be a mapping of $I$ onto
another set, and let $Y$ be defined by $Y_t = h(X_t)$ $(t \ge 0)$. We address the filtering problem
for $X$ in terms of the observation $Y$, which is not directly affected by noise. We write down
explicit equations for the filtering process $\Pi_t(i) = \mathbb{P}(X_t = i \mid \mathcal{Y}_t^0)$
$(i \in I,\ t \ge 0)$, where $(\mathcal{Y}_t^0)$ is the natural filtration of $Y$. We show that $\Pi$
is a Markov process with the Feller property. We also prove that it is a piecewise-deterministic
Markov process in the sense of Davis, and we identify its characteristics explicitly. We finally
solve an optimal stopping problem for $X$ with partial observation, i.e. where the moment of
stopping is required to be a stopping time with respect to $(\mathcal{Y}_t^0)$.

In the classical formulation of the filtering problem in continuous time the basic datum is a
pair of stochastic processes $(X_t)_{t \ge 0}$ and $(Y_t)_{t \ge 0}$, defined on some probability
space $(\Omega, \mathcal{F}, \mathbb{P})$, with values in measurable spaces $(I, \mathcal{I})$ and
$(O, \mathcal{O})$ respectively. $X$ is called the unobserved (or signal) process and $Y$ the
observation process. The filtering process is defined as

\[ \Pi_t(A) = \mathbb{P}(X_t \in A \mid \mathcal{Y}_t^0), \qquad A \in \mathcal{I},\ t \ge 0, \]

where $\mathcal{Y}_t^0 = \sigma(Y_s,\ s \in [0,t])$ are the $\sigma$-algebras of the filtration
generated by the observation process. The filtering problem consists in a description of the
measure-valued process $\Pi$ and its properties. Often, $\Pi$ is shown to satisfy some differential
equations, called the filtering equations, and in several cases it can be characterized as the
unique solution of such equations.

Various kinds of unobserved processes have been considered in the literature. $X$ is often
taken to be a diffusion process, the solution of a stochastic differential equation driven by a
Wiener process. The case of $X$ being a Markov chain, or more generally a marked point process,
is also frequently addressed.

Concerning the observation process, the large majority of cases considered in the literature
are variants of the following situation: $Y$ takes values in the euclidean space
$O = \mathbb{R}^m$ and has the form

\[ Y_t = \int_0^t H(X_s)\,ds + \sigma W_t, \qquad t \ge 0, \tag{1.1} \]

where $W$ is a standard Wiener process in $\mathbb{R}^m$, $H : I \to \mathbb{R}^m$ is a given
function and $\sigma$ is a constant, $\sigma \ne 0$.

As is well known, special results are available in the linear Gaussian case, the so-called
Kalman-Bucy filter.

Recently, the following different model has been addressed by several authors:

\[ Y_t = h(X_t), \qquad t \ge 0, \tag{1.2} \]

where $h : I \to O$ is a given function. We call $Y$ a noise-free observation, since it is not
directly affected by noise but rather all sources of randomness are included in the unobserved
process $X$ (in [14], $Y$ is called perfect observation).

A basic motivation for studying noise-free observations arises in connection with the following
variant of (1.1):

\[ Y_t = \int_0^t H(X_s)\,ds + \int_0^t \sigma(X_s)\,dW_s, \qquad t \ge 0, \tag{1.3} \]

where a state-dependent diffusion coefficient $\sigma$ occurs. It can be proved that this model can
be converted into another one where the observation process has two components: one is similar to
the traditional model (1.1) and the other is noise-free, i.e., of the form (1.2): see [19], [15], [8].

Noise-free observations have also been considered in connection with the classical topic of
filter stability. It has been discovered that several ergodicity properties of the filtering process
$\Pi$ fail to hold when the observation is noise-free: see for instance [2] and the discussion and
references therein; see also our Remark 3.5.

Another motivation, pointed out in [14], is that a clear picture of the noise-free case may
also lead to a better understanding of the limiting behavior of the model (1.1) as $\sigma \to 0$.

Besides these motivations, it is our opinion that the case of noise-free observation deserves
attention in its own right. The few existing results are already mathematically interesting. More
importantly, it seems to be the natural mathematical model to describe situations where the
available observations are indeed very accurate. For instance, when randomness is introduced
in order to represent uncertainty about the state of an evolving dynamical system, it might be
unnatural to introduce a noise affecting the observation in order to fit the standard framework
(1.1), unless this corresponds to a substantial description of inaccuracy of measurements.

In spite of its interest, there are few existing results on the case of noise-free observation, with
the important exception of the Kalman-Bucy filtering theory (for the latter we limit ourselves to
noting that generalizations of (1.1) to the case of degenerate noise acting on $Y$ date back at
least to the paper [6], and we will not give detailed references for the linear Gaussian case). In
fact, we are aware of [14] as the only paper entirely devoted to nonlinear filtering in the case of
noise-free observation. In [14], $X$ is defined as the solution to a stochastic differential equation
in $I = \mathbb{R}^n$, the observation takes values in $O = \mathbb{R}^m$ and the function $h$ in
(1.2) is assumed to satisfy special assumptions. The main result is that the filtering equation can
be formulated as a stochastic equation on a submanifold of $\mathbb{R}^n$, and under appropriate
conditions the law of $\Pi$ admits a density with respect to the surface measure. These results are
used in [8] to study the model (1.3).

The purpose of the present paper is a systematic study of the filtering problem with noise-free
observation when $X$ is assumed to be a time-homogeneous Markov chain in a finite set $I$.
Thus, our basic data will be a pair of finite sets $I$ and $O$, a function $h : I \to O$ (assumed
to be surjective without loss of generality), the rate transition matrix $\Lambda$ of a Markov chain
$X$ in $I$, and the noise-free observation process defined by (1.2). The filtering process is
specified by the finite set of scalar processes

\[ \Pi_t(i) = \mathbb{P}(X_t = i \mid \mathcal{Y}_t^0), \qquad i \in I,\ t \ge 0, \]

where $\mathcal{Y}_t^0 = \sigma(Y_s,\ s \in [0,t])$.

Our main results are the following. In Section 2, after some notation and preliminary results,
we present the filtering equations: they are a system of ordinary differential equations with
jumps and with random coefficients (depending on the observation process) and their unique
solution provides a modification of the filtering process $\Pi$. The method of proof is based on a
discrete approximation: we first write down the filtering equations corresponding to observing
the process $Y$ only at times $k 2^{-n}$ $(k = 0, 1, \dots)$ and we pass to the limit as
$n \to \infty$.

The noise-free case has the following special feature: if at some time $t$ we observe
$Y_t = a \in O$ then we know that $\Pi_t$ has support in the corresponding level set of the function
$h$, namely $h^{-1}(a) = \{i \in I : h(i) = a\}$. So the natural state space of the process $\Pi$
consists of probability measures on $I$ which are supported in one of the level sets $h^{-1}(a)$
$(a \in O)$. We call this space the effective simplex $\Delta_e$. In Section 3, after introducing an
appropriate canonical set-up and solving the so-called prediction problem, we establish that $\Pi$
is a Markov process in $\Delta_e$ with respect to the filtration $(\mathcal{Y}_t^0)$: see
Proposition 3.4. We also recall a known counterexample about the lack of ergodic properties of the
filtering process in the case of noise-free observation: see Remark 3.5.

Since the trajectories of the observation process are piecewise constant, the law of $Y$ is
completely determined by the finite-dimensional distributions of the process
$Y_0, T_1, Y_{T_1}, T_2, Y_{T_2}, \dots$, where $T_j$ denote the jump times of $Y$. In Section 4 we
find explicit formulae for those distributions in terms of the filtering process $\Pi$. Although
this is mainly a technical point in preparation of the results to follow, it has some immediate
application: see for instance Proposition 4.4, where we prove an explicit formula for the law of
the exit time of a finite Markov chain from a given set, a result that we could not find in the
literature.

We note that in our model new information is available only at jump times $T_j$. Therefore it
is not surprising that the filtering equations prescribe a smooth, deterministic evolution of the
trajectories $t \mapsto \Pi_t(\omega)$ between such jump times, and a jump of $\Pi$ at each time
$T_j$. An important class of Markov processes, having jumps at some random times and otherwise
evolving along a deterministic flow, was introduced by M.H.A. Davis in [9] and named
piecewise-deterministic Markov processes (PDPs). In Section 5 we show that the filtering process
$\Pi$ is a PDP in the sense of Davis, and we explicitly describe its characteristics (the flow, the
jump rate function and the transition measure): see Theorem 5.4. Since PDPs have been extensively
studied, see for instance the book [10], many known results on PDPs apply immediately to the
filtering process. For instance, a precise description of its generator is known, in terms of the
characteristics. Moreover, we are able to show the Feller property for the process $\Pi$ using
arguments from [10]: see Proposition 5.5, where we prove in addition that the transition semigroup
of $\Pi$ is strongly continuous in the space of continuous functions on the state space $\Delta_e$,
equipped with the supremum norm.

In Section 6 we study an optimal stopping problem for the Markov chain $X$ with partial
observation. The functional to be minimized has classical form, but the moment of stopping
is required to be a stopping time with respect to the filtration $(\mathcal{Y}_t^0)$ generated by
$Y$, i.e., it has to be based only on the observed process. We follow the classical approach: we
first solve an optimal stopping problem for the filtering process $\Pi$, appropriately formulated
and with complete observation, and then show how this gives a solution to the original problem: see
for instance [16] and the references therein for this approach in a general framework. Once more
the theory of PDPs turns out to be a very useful tool here, since the existence of an optimal
stopping time for $\Pi$, as well as a characterization of the value function and the stopping rule,
are a direct application of known results on PDPs. We obtain corresponding results for the original
problem with partial observation.

When $X$ is a finite Markov chain and the observation has the classical form (1.1), the
corresponding process $\Pi$ is called the Wonham filter, see [20]. In spite of the simple structure
of the unobserved process $X$, this case is still the subject of current study, see for instance [2]
for investigations on stability of the Wonham filter. In section 3 of that paper the following
earlier example of noise-free observation due to [11] is analyzed: $X$ is the Markov chain in the
space $I = \{1, 2, 3, 4\}$ with rate transition matrix

[matrix display (1.4) not recoverable from the source]

and $Y$ takes values in $O = \{0, 1\}$ with the function $h$ in (1.2) given by

\[ h(1) = h(3) = 1, \qquad h(2) = h(4) = 0. \tag{1.5} \]

In this specific case the filtering equations are deduced: see Proposition 3.2 in [2]. The method
of proof is different from ours and relies on martingale methods. It should be mentioned at
this point that several methods are known to prove that the filtering process is a solution of
the corresponding filtering equations, in the case of noisy observation (1.1). It is possible that
some of them can be applied to deduce the filtering equations in the general case of noise-free
observation as well. For instance one may try to generalize Proposition 3.2 in [2] mentioned
above, or one may rely on the fact that the natural filtration of a jump process (in our case,
the filtration $(\mathcal{Y}_t^0)$) can be described in a precise way: see for instance [TU] or the
earlier works [4]. However, in this paper we present an elementary proof of the filtering
equations, based on discrete approximation, that is self-contained and does not use deep results
from the general theory of stochastic processes.

We finally mention that the main results of this paper have been presented at the First
CIRM-HCM Joint Meeting: Stochastic Analysis, SPDEs, Particle Systems, Optimal Transport,
Levico Terme (Italy), January 24-30, 2010.

2 The filtering problem
2.1 Formulation

The filtering problem will be described starting from the following basic objects, which are
assumed to be given throughout the paper.

1. $I$ is a finite set.

Elements of $I$, usually denoted by letters $i, j, k, \dots$, are called states, and $I$ is called
the state space.

2. $\Lambda$ is a rate transition matrix on $I$ (sometimes called a Q-matrix, see e.g. [17]), i.e. a
square matrix whose elements $\Lambda_{ij}$ are real numbers satisfying $\Lambda_{ij} \ge 0$ for
$i \ne j$ and $\sum_{j \in I} \Lambda_{ij} = 0$ for every $i \in I$.

3. $h$ is a surjective function defined on $I$ taking values in another finite set $O$.

$O$ is called the observation space. Its elements will be denoted by letters $a, b, c, \dots$ The
assumption that $h$ is surjective does not involve any loss of generality, since $O$ may be
replaced by the image of $h$ in all that follows.

Suppose that on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ a process
$(X_t)_{t \ge 0}$ is defined, taking values in $I$. $(X_t)$ will be called the unobserved process.
We assume that it is a Markov process with generator $\Lambda$, i.e. for every $t, s \ge 0$ and
every real function $f$ on $I$ we have

\[ \mathbb{E}[f(X_{t+s}) \mid \mathcal{F}_t^0] = (e^{s\Lambda} f)(X_t), \qquad \mathbb{P}\text{-a.s.}, \]

where $\mathcal{F}_t^0 = \sigma(X_s : s \in [0,t])$, $t \ge 0$, are the $\sigma$-algebras of the
natural filtration of $(X_t)$. We denote by $\mu$ the initial distribution of $(X_t)$, i.e.
$\mu(i) = \mathbb{P}(X_0 = i)$, $i \in I$.
Next we define the observation process $(Y_t)_{t \ge 0}$ by the formula

\[ Y_t = h(X_t), \qquad t \ge 0, \]

and we introduce its natural filtration $(\mathcal{Y}_t^0)_{t \ge 0}$, where

\[ \mathcal{Y}_t^0 = \sigma(Y_s : s \in [0,t]), \qquad t \ge 0. \]

The filtering problem consists in describing the conditional distribution of $X_t$ given
$\mathcal{Y}_t^0$ for all $t \ge 0$. In other words, we look for a description of the processes
$\mathbb{P}(X_t = i \mid \mathcal{Y}_t^0)$, $t \ge 0$, for all $i \in I$. The main result of this
section, Theorem 2.3, states that appropriate modifications of those processes, denoted by
$\Pi_t(i)$, are the unique solutions of suitable differential equations, called the filtering
equations, driven by the observation process. In addition, these equations are explicitly written
and their solution is clearly described.
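
To make the set-up concrete, the following Python sketch simulates a trajectory of the unobserved chain and of the noise-free observation $Y_t = h(X_t)$. It is only an illustration under stated assumptions: the rate matrix `RATES` is a hypothetical cyclic example chosen here (it is not taken from the text), states are relabelled 0..3, `H_MAP` reproduces the function $h$ of (1.5), and the names `simulate_chain` and `observe` are ours.

```python
import random

# Hypothetical rate matrix on 4 states (rows sum to zero, off-diagonal >= 0).
RATES = [
    [-1.0, 1.0, 0.0, 0.0],
    [0.0, -1.0, 1.0, 0.0],
    [0.0, 0.0, -1.0, 1.0],
    [1.0, 0.0, 0.0, -1.0],
]
# h as in (1.5), with states 1..4 relabelled 0..3: h(1)=h(3)=1, h(2)=h(4)=0.
H_MAP = [1, 0, 1, 0]


def simulate_chain(rates, x0, horizon, rng):
    """Return the jumps [(time, state), ...] of the chain on [0, horizon]."""
    path = [(0.0, x0)]
    t, x = 0.0, x0
    while True:
        exit_rate = -rates[x][x]
        if exit_rate <= 0.0:  # absorbing state: no further jumps
            return path
        t += rng.expovariate(exit_rate)  # exponential holding time
        if t > horizon:
            return path
        # Pick the next state j != x with probability rates[x][j] / exit_rate.
        targets = [j for j in range(len(rates)) if j != x and rates[x][j] > 0.0]
        weights = [rates[x][j] for j in targets]
        x = rng.choices(targets, weights=weights)[0]
        path.append((t, x))


def observe(path, h):
    """Return the jumps [(time, value), ...] of Y = h(X); jumps of X that
    do not change h(X) are invisible and therefore dropped."""
    obs = [(path[0][0], h[path[0][1]])]
    for t, x in path[1:]:
        if h[x] != obs[-1][1]:
            obs.append((t, h[x]))
    return obs


if __name__ == "__main__":
    rng = random.Random(0)
    x_path = simulate_chain(RATES, 0, 5.0, rng)
    print(observe(x_path, H_MAP))
```

In this particular example every jump of $X$ changes the value of $h$, so the jump times of $Y$ coincide with those of $X$; for a general $h$ the observer sees only a subsequence of them.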

In order to present those equations we need to introduce some notation and preliminary 
results, presented in the next subsections. 

2.2 The effective simplex and the flow 

By $\Delta$ we denote the set of probability measures on $I$. $\Delta$ can be naturally identified
with the canonical simplex of $\mathbb{R}^N$, where $N$ is the cardinality of $I$, i.e. with the set
of $\mu = (\mu(i)) \in \mathbb{R}^N$ such that $\mu(i) \ge 0$ for all $i = 1, \dots, N$ and
$\sum_{i=1}^N \mu(i) = 1$.

We note that the sets $h^{-1}(a) = \{i \in I : h(i) = a\}$ form a partition of $I$ as $a$ varies in
$O$. We denote by $\Delta_a$ the set of probability measures on $I$ supported in $h^{-1}(a)$. Each
$\Delta_a$ can be considered as a subset of $\Delta$, so an element in $\Delta_a$ is an
$N$-dimensional vector $\mu = (\mu(i))$ in $\Delta$ such that $\mu(i) = 0$ if $h(i) \ne a$. This
way each $\Delta_a$ is a simplex in the euclidean space $\mathbb{R}^N$ and it is a face of
$\Delta$, i.e. its vertices are also vertices of $\Delta$. Moreover, as $a$ varies in $O$, the
vertices of the simplices $\Delta_a$ form a partition of the vertices of $\Delta$.

Finally we define $\Delta_e = \bigcup_{a \in O} \Delta_a$. This is a subset of $\Delta$, which we
call the effective simplex. It is a proper subset unless $h$ is constant. $\Delta_e$ is obviously a
compact space.

As a general rule probability measures $\mu$ on $I$, or equivalently elements of $\Delta$, will be
considered as row vectors of dimension $N$ equal to the cardinality of $I$, whereas functions
$f : I \to \mathbb{R}$ will be identified with the $N$-dimensional column vector $(f(i))$
consisting of its values. The integral of $f$ with respect to $\mu$ is denoted simply
$\mu f = \sum_{i \in I} \mu(i) f(i)$. Given a row vector $(\nu(i))$ (not necessarily a probability
measure) and a column vector $(f(i))$, both real and $N$-dimensional, we denote by $f * \nu$ the
row vector obtained by pointwise multiplication, i.e. having components
$(f * \nu)(i) = f(i)\nu(i)$, $i \in I$.

In describing properties of the filtering process a basic role will be played by a flow $\phi$ on
the effective simplex $\Delta_e$. The following proposition is used to define $\phi$ by means of a
differential equation.

Proposition 2.1 For every $a \in O$ and $x \in \Delta_a$ the differential equation

\[ y'(t) = 1_{h^{-1}(a)} * (y(t)\Lambda) - (y(t)\Lambda 1_{h^{-1}(a)})\, y(t), \qquad t \ge 0, \tag{2.1} \]

with initial condition $y(0) = x$, has a unique global solution $y : [0, \infty) \to \mathbb{R}^N$,
where $N$ denotes the cardinality of $I$.

Moreover $y(t) \in \Delta_a$ for all $t \ge 0$.

We will write $\phi_a(t,x)$ instead of $y(t)$, to stress dependence on $a$ and $x$. By standard
results on ordinary differential equations, $\phi_a$ is a continuous function of $(t,x)$ and it
enjoys the flow property $\phi_a(t, \phi_a(s,x)) = \phi_a(t+s, x)$, for $t, s \ge 0$, $a \in O$ and
$x \in \Delta_a$. $\phi_a$ is the flow associated to the vector field

\[ F_a(y) = 1_{h^{-1}(a)} * (y\Lambda) - (y\Lambda 1_{h^{-1}(a)})\, y, \qquad y \in \Delta_a, \]

on $\Delta_a$. To simplify the notation a little, it is convenient to define a global flow $\phi$
on $\Delta_e = \bigcup_{a \in O} \Delta_a$ in the obvious way, setting $\phi(t,x) = \phi_a(t,x)$ if
$x \in \Delta_a$. This way $\phi(t, \cdot)$ is a function $\Delta_e \to \Delta_e$ leaving each set
$\Delta_a$ invariant.
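
The flow can be approximated numerically. The sketch below integrates equation (2.1) with a classical fourth-order Runge-Kutta scheme and checks the two properties stated above (invariance of the simplex face and the flow property). It is a minimal illustration, not the construction used in the proof: the rate matrix `RATES`, the level set `LEVEL` and the function names are assumptions introduced here.

```python
def vector_field(y, rates, level):
    """F_a(y) = 1_a * (y Lambda) - (y Lambda 1_a) y, where `level` is h^{-1}(a)."""
    n = len(y)
    y_lam = [sum(y[i] * rates[i][j] for i in range(n)) for j in range(n)]
    masked = [y_lam[j] if j in level else 0.0 for j in range(n)]
    total = sum(masked)  # the scalar y Lambda 1_a
    return [masked[j] - total * y[j] for j in range(n)]


def flow(t, x, rates, level, steps=1000):
    """Approximate phi_a(t, x) by the classical fourth-order Runge-Kutta scheme."""
    y = list(x)
    dt = t / steps
    for _ in range(steps):
        k1 = vector_field(y, rates, level)
        k2 = vector_field([a + 0.5 * dt * b for a, b in zip(y, k1)], rates, level)
        k3 = vector_field([a + 0.5 * dt * b for a, b in zip(y, k2)], rates, level)
        k4 = vector_field([a + dt * b for a, b in zip(y, k3)], rates, level)
        y = [a + dt / 6.0 * (b + 2.0 * c + 2.0 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
    return y


# Hypothetical 3-state example: h^{-1}(a) = {0, 1}; state 2 carries another observation.
RATES = [[-2.0, 1.0, 1.0], [1.0, -3.0, 2.0], [1.0, 1.0, -2.0]]
LEVEL = {0, 1}

if __name__ == "__main__":
    print(flow(1.0, [1.0, 0.0, 0.0], RATES, LEVEL))
```

Starting from a vertex of the face, the computed trajectory keeps total mass one and never charges the state outside the level set, in agreement with Proposition 2.1.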

The rest of this subsection is devoted to the proof of Proposition 2.1 by means of a suitable
viability theorem. For this we have to recall the notion of contingent cone (see, e.g., [1],
Chapter 1).

Definition 2.1 Let $X$ be a normed space, $K$ be a nonempty subset of $X$ and $x$ belong to $K$.
The contingent cone to $K$ at $x$ is the set

\[ T_K(x) = \Big\{ v \in X : \liminf_{h \to 0^+} \frac{d_K(x + hv)}{h} = 0 \Big\}, \]

where $d_K(y)$ denotes the distance of $y$ to $K$, defined by

\[ d_K(y) := \inf_{z \in K} \|y - z\|. \]

In other words, $v$ belongs to $T_K(x)$ if and only if there exist a sequence of $h_n > 0$
converging to $0^+$ and a sequence of $v_n \in X$ converging to $v$ such that

\[ \forall n \ge 0, \qquad x + h_n v_n \in K. \]

We need to compute the contingent cone to the set $\Delta_a$.

Lemma 2.2 The contingent cone $T_{\Delta_a}(y)$ to $\Delta_a$ at $y \in \Delta_a$ is the cone of
elements $v \in \mathbb{R}^N$ satisfying

\[ \sum_{i \in h^{-1}(a)} v_i = 0 \quad \& \quad
\begin{cases} v_i = 0, & \text{if } i \notin h^{-1}(a), \\
v_i \ge 0, & \text{if } y_i = 0 \text{ for } i \in h^{-1}(a). \end{cases} \tag{2.2} \]

Proof. Let us take $v \in T_{\Delta_a}(y)$. There exist sequences $h_n > 0$ converging to $0^+$ and
$v_n$ converging to $v$ such that $z_n := y + h_n v_n$ belongs to $\Delta_a$ for any $n \ge 0$. Then

\[ \sum_{i \in h^{-1}(a)} v_{n,i} = \frac{1}{h_n} \Big( \sum_{i \in h^{-1}(a)} z_{n,i}
 - \sum_{i \in h^{-1}(a)} y_i \Big) = 0, \]

so that $\sum_{i \in h^{-1}(a)} v_{n,i} = 0$. On the other hand, if $i \notin h^{-1}(a)$, then
$y_i = z_{n,i} = 0$, hence $v_{n,i} = (z_{n,i} - y_i)/h_n = 0$. If $y_i = 0$ for
$i \in h^{-1}(a)$, then $v_{n,i} = z_{n,i}/h_n \ge 0$, since $z_n$ belongs to $\Delta_a$. Letting
$n \to \infty$ we conclude that $v$ satisfies (2.2).

Conversely, let us take $v$ satisfying (2.2) and deduce that $z := y + tv$ belongs to $\Delta_a$
for $t > 0$ small enough. First we have

\[ \sum_{i \in h^{-1}(a)} z_i = \sum_{i \in h^{-1}(a)} y_i + t \sum_{i \in h^{-1}(a)} v_i
 = \sum_{i \in h^{-1}(a)} y_i = 1. \]

Second, if $i \notin h^{-1}(a)$, then $v_i = 0$ and since $y \in \Delta_a$ also $y_i = 0$. Hence if
$i \notin h^{-1}(a)$ then $z_i = 0$. Moreover if $i \in h^{-1}(a)$, then $z_i \ge 0$. In fact if
$y_i = 0$ then $z_i = t v_i \ge 0$, because in this case $v_i$ is nonnegative. If $y_i > 0$ it is
sufficient to take $t < y_i/|v_i|$ (when $v_i \ne 0$) for having $z_i > 0$. Hence $z$ belongs to
$\Delta_a$.
□

Proof of Proposition 2.1. $\Delta_a$ is closed and is a viability domain of $F_a$. This means that,
for all $y \in \Delta_a$, $F_a(y)$ belongs to $T_{\Delta_a}(y)$, the contingent cone to $\Delta_a$
at $y$. In fact $v = F_a(y)$ satisfies

\[ \begin{aligned}
\sum_{i \in h^{-1}(a)} v_i
&= \sum_{i \in h^{-1}(a)} [1_{h^{-1}(a)} * (y\Lambda)]_i
 - (y\Lambda 1_{h^{-1}(a)}) \sum_{i \in h^{-1}(a)} y_i \\
&= (1_{h^{-1}(a)} * (y\Lambda))\, 1_{h^{-1}(a)} - (y\Lambda 1_{h^{-1}(a)})\, y 1_{h^{-1}(a)} \\
&= (y\Lambda 1_{h^{-1}(a)}) - (y\Lambda 1_{h^{-1}(a)})\, y 1_{h^{-1}(a)} \\
&= (y\Lambda 1_{h^{-1}(a)})(1 - y 1_{h^{-1}(a)}) = 0,
\end{aligned} \]

since $y \in \Delta_a$ and $\sum_{i \in h^{-1}(a)} y_i = y 1_{h^{-1}(a)} = 1$. Moreover if
$i \notin h^{-1}(a)$, then $v_i = [1_{h^{-1}(a)} * (y\Lambda)]_i - (y\Lambda 1_{h^{-1}(a)})\, y_i = 0$,
since both terms are equal to zero. If $y_i = 0$ for $i \in h^{-1}(a)$, then

\[ v_i = [1_{h^{-1}(a)} * (y\Lambda)]_i = \sum_{j \in h^{-1}(a),\, j \ne i} y_j \Lambda_{ji} \ge 0, \]

since $\Lambda_{ji} \ge 0$ for $j \ne i$ and $y_j \ge 0$ for $j \in h^{-1}(a)$.

Moreover $|F_a(y)| \le C(1 + |y|)$ for all $y \in \Delta_a$. Then by Theorem 1.2.4 in [1], page 28,
$\Delta_a$ is viable under $F_a$: for every initial state $x \in \Delta_a$ there exists a solution
$y(\cdot)$ to the differential equation (2.1) such that $y(t) \in \Delta_a$ for each
$t \in [0, \infty)$. The uniqueness follows from the fact that the function $F_a(y)$ is locally
Lipschitz on $\mathbb{R}^N$. □
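
The viability condition used in this proof can be checked numerically on examples. The following sketch tests the three conditions of (2.2) for $v = F_a(y)$ at randomly chosen points $y$ of a face $\Delta_a$, including points on its boundary. The rate matrix, the level set and the helper names are hypothetical choices made for this illustration, and a numerical check of this kind is of course no substitute for the argument above.

```python
import random


def vector_field(y, rates, level):
    """F_a(y) = 1_a * (y Lambda) - (y Lambda 1_a) y, where `level` is h^{-1}(a)."""
    n = len(y)
    y_lam = [sum(y[i] * rates[i][j] for i in range(n)) for j in range(n)]
    masked = [y_lam[j] if j in level else 0.0 for j in range(n)]
    total = sum(masked)
    return [masked[j] - total * y[j] for j in range(n)]


def in_contingent_cone(v, y, level, tol=1e-12):
    """Check the conditions (2.2) for the vector v at the point y of Delta_a."""
    zero_sum = abs(sum(v[i] for i in level)) <= tol
    zero_outside = all(v[i] == 0.0 for i in range(len(v)) if i not in level)
    inward = all(v[i] >= 0.0 for i in level if y[i] == 0.0)
    return zero_sum and zero_outside and inward


def random_point(level, n, rng, zero_index=None):
    """A random probability vector supported in `level`, optionally with one zero entry."""
    w = [rng.random() if i in level and i != zero_index else 0.0 for i in range(n)]
    s = sum(w)
    return [x / s for x in w]


# Hypothetical data: 4 states, h^{-1}(a) = {0, 1, 2}.
RATES = [[-3.0, 1.0, 1.0, 1.0], [2.0, -4.0, 1.0, 1.0],
         [0.5, 0.5, -2.0, 1.0], [1.0, 1.0, 1.0, -3.0]]
LEVEL = {0, 1, 2}

if __name__ == "__main__":
    rng = random.Random(0)
    y = random_point(LEVEL, 4, rng, zero_index=1)
    print(in_contingent_cone(vector_field(y, RATES, LEVEL), y, LEVEL))
```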



2.3 The operator H 



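For every $a \in O$ we define a function $H_a$, mapping row vectors $\mu \in \mathbb{R}^N$ to row
vectors $H_a[\mu] \in \mathbb{R}^N$, defined for $i \in I$ by

\[ H_a[\mu](i) = \begin{cases}
0, & \text{if } h(i) \ne a, \\
\dfrac{\mu(i)}{\sum_{j \in h^{-1}(a)} \mu(j)}, & \text{if } h(i) = a,\ \sum_{j \in h^{-1}(a)} \mu(j) \ne 0, \\
\nu_a(i), & \text{if } h(i) = a,\ \sum_{j \in h^{-1}(a)} \mu(j) = 0,
\end{cases} \]

where $\nu_a$ is a fixed arbitrary probability on $I$ supported in $h^{-1}(a)$, whose exact values
are irrelevant. Using the notation introduced in the previous section we may write

\[ H_a[\mu] = \begin{cases}
\dfrac{1}{\mu 1_{h^{-1}(a)}}\, 1_{h^{-1}(a)} * \mu, & \text{if } \mu 1_{h^{-1}(a)} \ne 0, \\
\nu_a, & \text{if } \mu 1_{h^{-1}(a)} = 0.
\end{cases} \]

We note that if $\mu(i) \ge 0$ for all $i$ then $H_a[\mu]$ is a probability measure on $I$
supported on $h^{-1}(a)$, i.e. an element of $\Delta_a$. If in addition $\mu$ is a probability then
$H_a[\mu]$ is the corresponding conditional probability given the event $\{h = a\}$.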

The relevance of the operator $H$ to the filtering problem is well known. Indeed suppose that
$(X_k)_{k \in \mathbb{N}}$ is a discrete-time Markov process in $I$ with transition matrix $P$ and
initial distribution $\mu$ and the observation process is defined by $Y_k = h(X_k)$. Then, see for
instance [TJ], the discrete-time filtering process defined for $i \in I$ by the formula
$\Pi_k(i) := \mathbb{P}(X_k = i \mid Y_0, \dots, Y_k)$ satisfies the recursive equations

\[ \Pi_0 = H_{Y_0}[\mu], \qquad \Pi_k = H_{Y_k}[\Pi_{k-1} P], \quad k \ge 1. \tag{2.3} \]
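
The operator $H$ and the recursion (2.3) translate directly into code. The sketch below is a plain-Python illustration under assumptions of our own: `condition` plays the role of $H_a$ (with the uniform law on $h^{-1}(a)$ as a default choice for the arbitrary $\nu_a$), `discrete_filter` iterates (2.3), and the data `P`, `H_MAP`, `MU` are hypothetical.

```python
def condition(a, mu, h, fallback=None):
    """Operator H_a: restrict mu to the level set {h = a} and normalise.

    If the restricted mass is zero, return `fallback` (the arbitrary nu_a);
    by default the uniform probability on h^{-1}(a) is used.
    """
    masked = [m if h[i] == a else 0.0 for i, m in enumerate(mu)]
    total = sum(masked)
    if total != 0.0:
        return [m / total for m in masked]
    if fallback is not None:
        return list(fallback)
    size = sum(1 for v in h if v == a)
    return [1.0 / size if v == a else 0.0 for v in h]


def discrete_filter(mu, p, h, observations):
    """Recursion (2.3): Pi_0 = H_{Y_0}[mu], Pi_k = H_{Y_k}[Pi_{k-1} P]."""
    n = len(mu)
    pi = condition(observations[0], mu, h)
    for y in observations[1:]:
        # Prediction step: row vector times transition matrix.
        predicted = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        # Correction step: condition on the observed level set.
        pi = condition(y, predicted, h)
    return pi


# Hypothetical data: 3 states, states 0 and 1 share the observation 0.
P = [[0.9, 0.1, 0.0], [0.0, 0.5, 0.5], [0.3, 0.0, 0.7]]
H_MAP = [0, 0, 1]
MU = [0.25, 0.25, 0.5]

if __name__ == "__main__":
    print(discrete_filter(MU, P, H_MAP, [0, 0]))
```

With these data, observing $Y_0 = Y_1 = 0$ gives $\Pi_0 = (0.5, 0.5, 0)$, the prediction $\Pi_0 P = (0.45, 0.30, 0.25)$, and hence $\Pi_1 = (0.6, 0.4, 0)$ up to rounding.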

2.4 The filtering equation 

Let $(T_j)_{j \ge 1}$ denote the sequence of jump times of $(Y_t)$, with the convention that
$T_j = \infty$ for all $j$ if no jump occurs and $T_j < T_{j+1} = T_{j+2} = \dots = \infty$ if
precisely $j$ jumps occur. We set $T_0 = 0$.

We define a process $(\Pi_t)$ and we will eventually prove that it is a modification of the
filtering process. For every $\omega \in \Omega$ we consider the corresponding trajectory
$Y_t(\omega)$ and jump times $T_j(\omega)$. Next we set $\Pi_0(\omega) = H_{Y_0(\omega)}[\mu]$ and
for $j \ge 1$,

\[ \begin{aligned}
\Pi_t(\omega) &= \phi(t - T_{j-1}(\omega), \Pi_{T_{j-1}}(\omega)) && \text{for } T_{j-1}(\omega) \le t < T_j(\omega), \\
\Pi_{T_j-}(\omega) &= \phi(T_j(\omega) - T_{j-1}(\omega), \Pi_{T_{j-1}}(\omega)), && \text{if } T_j(\omega) < \infty, \\
\Pi_{T_j}(\omega) &= H_{Y_{T_j}(\omega)}[\Pi_{T_j-}(\omega)\Lambda], && \text{if } T_j(\omega) < \infty,
\end{aligned} \tag{2.4} \]

where $\phi$ is the global flow defined in Section 2.2.

Note that $\Pi_{T_j-}$ and $\Pi_{T_j}$ are only defined on $\{T_j < \infty\}$ and on this set
$\Pi_{T_j-}(\omega)$ is the usual left limit $\lim_{t \uparrow T_j(\omega)} \Pi_t(\omega)$.

It follows from the properties of the operator $H$ that $\Pi_{T_j}$ is supported on
$h^{-1}(Y_{T_j})$ $(j \ge 0)$; equivalently, $\Pi_{T_j} \in \Delta_{Y_{T_j}}$. Moreover, since the
flow $\phi$ leaves each set $\Delta_a$ invariant, we conclude that $\Pi_t$ is supported on
$h^{-1}(Y_{T_{j-1}})$ for $T_{j-1}(\omega) \le t < T_j(\omega)$ $(j \ge 1)$.

The process $(\Pi_t)$ can also be described in the following way. Its starting point is
$\Pi_0 = H_{Y_0}[\mu]$ and it has jumps precisely at times $T_j$ $(j \ge 1)$. At each jump time it
jumps from $\Pi_{T_j-}$ to $\Pi_{T_j} = H_{Y_{T_j}}[\Pi_{T_j-}\Lambda]$. Between jump times,
trajectories evolve deterministically. Namely, for $T_{j-1} \le t < T_j$, we have $\Pi_t(i) = 0$ if
$i \notin h^{-1}(Y_{T_{j-1}})$ and

\[ \Pi_t'(i) = (\Pi_t\Lambda)(i) - (\Pi_t\Lambda 1_{h^{-1}(Y_{T_{j-1}})})\, \Pi_t(i) \tag{2.5} \]

if $i \in h^{-1}(Y_{T_{j-1}})$, where the time derivative is understood as the right derivative at
time $t = T_{j-1}$. Using previously introduced notation we can write the differential equation in
vector form

\[ \Pi_t' = 1_{h^{-1}(Y_{T_{j-1}})} * (\Pi_t\Lambda) - (\Pi_t\Lambda 1_{h^{-1}(Y_{T_{j-1}})})\, \Pi_t,
\qquad t \in [T_{j-1}, T_j), \]

and we could even replace $Y_{T_{j-1}}$ by $Y_t$, since $Y$ is constant between jump times.
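
The construction (2.4) can be sketched as a small program: between the jump times of $Y$ the vector is transported by the flow (here approximated by explicit Euler steps), and at each jump time it is replaced by $H_{Y_{T_j}}[\Pi_{T_j-}\Lambda]$. This is an illustrative sketch only. The data `RATES`, `H_MAP`, `MU`, `OBS`, the helper names and the choice of the Euler scheme are assumptions made here, and the uniform law stands in for the arbitrary $\nu_a$.

```python
def condition(a, mu, h):
    """Operator H_a; the zero-mass case falls back to the uniform law on h^{-1}(a)."""
    masked = [m if h[i] == a else 0.0 for i, m in enumerate(mu)]
    total = sum(masked)
    if total != 0.0:
        return [m / total for m in masked]
    size = sum(1 for v in h if v == a)
    return [1.0 / size if v == a else 0.0 for v in h]


def times_rates(y, rates):
    """Row vector times matrix: y Lambda."""
    n = len(y)
    return [sum(y[i] * rates[i][j] for i in range(n)) for j in range(n)]


def drift(y, rates, h, a):
    """Right-hand side of (2.5) in vector form, for the current observation a."""
    masked = [v if h[j] == a else 0.0 for j, v in enumerate(times_rates(y, rates))]
    total = sum(masked)
    return [m - total * v for m, v in zip(masked, y)]


def flow(t, x, rates, h, a, dt=1e-3):
    """Approximate phi(t, x) on Delta_a with explicit Euler steps of size <= dt."""
    steps = max(1, int(t / dt) + 1)
    y, step = list(x), t / steps
    for _ in range(steps):
        y = [v + step * d for v, d in zip(y, drift(y, rates, h, a))]
    return y


def filter_at(t, mu, rates, h, obs):
    """Pi_t from (2.4); `obs` = [(0, Y_0), (T_1, Y_{T_1}), ...] lists the jumps of Y."""
    last_time, a = obs[0]
    pi = condition(a, mu, h)
    for jump_time, new_a in obs[1:]:
        if jump_time > t:
            break
        pre_jump = flow(jump_time - last_time, pi, rates, h, a)  # Pi_{T_j -}
        pi = condition(new_a, times_rates(pre_jump, rates), h)   # Pi_{T_j}
        last_time, a = jump_time, new_a
    return flow(t - last_time, pi, rates, h, a)


# Hypothetical data: states 0 and 1 are observed as 0, state 2 as 1.
RATES = [[-2.0, 1.0, 1.0], [1.0, -3.0, 2.0], [1.0, 1.0, -2.0]]
H_MAP = [0, 0, 1]
MU = [0.25, 0.25, 0.5]
OBS = [(0.0, 0), (0.7, 1), (1.2, 0)]

if __name__ == "__main__":
    for time in (0.5, 1.0, 1.5):
        print(time, filter_at(time, MU, RATES, H_MAP, OBS))
```

In this example the level set of the observation 1 is a single state, so the filter is a point mass while $Y = 1$; after the jump back to 0 it restarts from $H_0[\delta_2 \Lambda]$ and evolves again along the flow.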

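Finally, it is also possible to describe the process $(\Pi_t)$ by a single integral equation: for
every $\omega \in \Omega$,

\[ \begin{aligned}
\Pi_t(\omega) = H_{Y_0(\omega)}[\mu]
&+ \int_0^t \big\{ 1_{h^{-1}(Y_s(\omega))} * (\Pi_s(\omega)\Lambda)
 - (\Pi_s(\omega)\Lambda 1_{h^{-1}(Y_s(\omega))})\, \Pi_s(\omega) \big\}\, ds \\
&+ \sum_{0 < T_j(\omega) \le t} \big\{ H_{Y_{T_j}(\omega)}[\Pi_{T_j-}(\omega)\Lambda] - \Pi_{T_j-}(\omega) \big\},
\qquad t \ge 0.
\end{aligned} \tag{2.6} \]

Note that only indices $j \ge 1$ enter the sum, since $0 = T_0(\omega) < T_j(\omega)$.

We will refer to (2.6) as the filtering equation. This is justified by the next theorem.

Theorem 2.3 The process $(\Pi_t)$, defined by equation (2.6), is a modification of the filtering
process: for $t \ge 0$ and $i \in I$ we have
$\Pi_t(i) = \mathbb{P}(X_t = i \mid \mathcal{Y}_t^0)$ $\mathbb{P}$-a.s.

Recall that in the definition of the operator $H$ we used some arbitrary probabilities, denoted
$\nu_a$. It is worth noting that they do not affect the definition of the filtering process, in the
sense that we have, $\mathbb{P}$-a.s.,

\[ \Pi_{T_j} = H_{Y_{T_j}}[\Pi_{T_j-}\Lambda]
 = \frac{1_{h^{-1}(Y_{T_j})} * (\Pi_{T_j-}\Lambda)}{\Pi_{T_j-}\Lambda 1_{h^{-1}(Y_{T_j})}} \tag{2.7} \]

for every $j \ge 1$. Therefore, the filtering equation (2.6) can also be written as follows:

\[ \begin{aligned}
\Pi_t(\omega) = H_{Y_0(\omega)}[\mu]
&+ \int_0^t \big\{ 1_{h^{-1}(Y_s(\omega))} * (\Pi_s(\omega)\Lambda)
 - (\Pi_s(\omega)\Lambda 1_{h^{-1}(Y_s(\omega))})\, \Pi_s(\omega) \big\}\, ds \\
&+ \sum_{0 < T_j(\omega) \le t} \Big\{
 \frac{1_{h^{-1}(Y_{T_j}(\omega))} * (\Pi_{T_j-}(\omega)\Lambda)}{\Pi_{T_j-}(\omega)\Lambda 1_{h^{-1}(Y_{T_j}(\omega))}}
 - \Pi_{T_j-}(\omega) \Big\}, \qquad t \ge 0.
\end{aligned} \]

To prove (2.7) it is enough to show that the denominator $\Pi_{T_j-}\Lambda 1_{h^{-1}(Y_{T_j})}$
never vanishes. This is the content of Lemma 2.4, which is stated in a slightly more general form,
as required in the proof of Theorem 2.3 that will follow.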



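Lemma 2.4 Let $(\Pi_t)_{t \ge 0}$ be a process with nonnegative components $\Pi_t(\omega, i)$
satisfying, for some fixed $j \ge 1$, the following conditions:

1. for $T_{j-1}(\omega) < \infty$ and $T_{j-1}(\omega) \le t < T_j(\omega)$, we have
$\Pi_t(\omega, i) = 0$ if $i \notin h^{-1}(Y_{T_{j-1}}(\omega))$ and

\[ \Pi_t'(\omega, i) = (\Pi_t(\omega)\Lambda)(i)
 - (\Pi_t(\omega)\Lambda 1_{h^{-1}(Y_{T_{j-1}}(\omega))})\, \Pi_t(\omega, i) \tag{2.8} \]

if $i \in h^{-1}(Y_{T_{j-1}}(\omega))$;

2. for every $t \ge 0$

\[ \Pi_t(i)\, 1_{T_{j-1} \le t < T_j} = \mathbb{P}(X_t = i \mid \mathcal{Y}_t^0)\, 1_{T_{j-1} \le t < T_j}
\qquad \mathbb{P}\text{-a.s.} \tag{2.9} \]

Then

\[ \Pi_{T_j-}\Lambda 1_{h^{-1}(Y_{T_j})} > 0 \qquad \mathbb{P}\text{-a.s. on } \{T_j < \infty\}. \tag{2.10} \]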
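Proof. First we prove that for all $i \in I$

\[ \{\Pi_{T_j-}(i) = 0,\ T_j < \infty\} \subset \{X_{T_j-} \ne i,\ T_j < \infty\}, \qquad \mathbb{P}\text{-a.s.} \tag{2.11} \]

We fix $\omega \in \Omega$ such that $T_{j-1}(\omega) < \infty$, and argue for
$T_{j-1}(\omega) \le t < T_j(\omega)$. Define

\[ E_t := \exp\Big( \int_{T_{j-1}}^t \Pi_s\Lambda 1_{h^{-1}(Y_{T_{j-1}})}\, ds \Big), \qquad U_t := E_t \Pi_t. \]

Since $E_t > 0$, $U_t$ and $\Pi_t$ have the same support. Then $U_t(i) = 0$ if
$i \notin h^{-1}(Y_{T_{j-1}})$. Now let $i \in h^{-1}(Y_{T_{j-1}})$. It follows from (2.8) that $U$
satisfies $U_t'(i) = (U_t\Lambda)(i)$ so that

\[ U_t'(i) = \Lambda_{ii} U_t(i) + \sum_{k \ne i} U_t(k)\Lambda_{ki} \ge \Lambda_{ii} U_t(i). \]

This implies that $U_t(i) \ge e^{\Lambda_{ii}(t-s)} U_s(i)$ for $T_{j-1} \le s \le t < T_j$, hence
$U_s(i) > 0$ implies $U_t(i) > 0$ for all $t \in [s, T_j)$ and in particular $U_{T_{j-1}}(i) > 0$
implies $U_t(i) > 0$ for all $t \in [T_{j-1}, T_j)$. The same result holds for $\Pi_t(i)$, so we
conclude that, for every $i \in I$, the process

\[ 1_{\Pi_t(i) = 0}\, 1_{T_{j-1} \le t < T_j} \tag{2.12} \]

has at most two jumps on $[0, \infty)$. We also deduce that, if $T_j < \infty$,

\[ \Pi_{T_j-}(i) = \lim_{t \uparrow T_j} \Pi_t(i) = \lim_{t \uparrow T_j} E_t^{-1} U_t(i)
 \ge E_{T_j}^{-1} e^{\Lambda_{ii}(T_j - s)} U_s(i), \qquad T_{j-1} \le s < T_j, \]

and we also conclude that, for every $i \in I$, on $\{T_j < \infty\}$,

\[ \Pi_{T_j-}(i) = 0 \implies \Pi_t(i) = 0 \quad \forall t \in [T_{j-1}, T_j). \tag{2.13} \]

On the other hand, by (2.9), for every $t \ge 0$,

\[ \begin{aligned}
\mathbb{P}(X_t = i,\ \Pi_t(i) = 0,\ T_{j-1} \le t < T_j)
&= \mathbb{E}[1_{X_t = i}\, 1_{\Pi_t(i) = 0}\, 1_{T_{j-1} \le t < T_j}] \\
&= \mathbb{E}[1_{\Pi_t(i) = 0}\, 1_{T_{j-1} \le t < T_j}\, \mathbb{P}(X_t = i \mid \mathcal{Y}_t^0)] \\
&= \mathbb{E}[1_{\Pi_t(i) = 0}\, 1_{T_{j-1} \le t < T_j}\, \Pi_t(i)] = 0.
\end{aligned} \]

Therefore the process $1_{X_t = i}\, 1_{\Pi_t(i) = 0}\, 1_{T_{j-1} \le t < T_j}$ is a modification of
the zero process. Recalling the process in (2.12) and the fact that $X$ is piecewise constant, it is
also indistinguishable from zero.

Now we are ready to prove (2.11), which is equivalent to
$\mathbb{P}(\Pi_{T_j-}(i) = 0,\ X_{T_j-} = i,\ T_j < \infty) = 0$. Suppose that for some
$\omega \in \Omega$ we have $T_j < \infty$, $\Pi_{T_j-}(i) = 0$ and $X_{T_j-} = i$. By (2.13) we
also have $\Pi_t(i) = 0$ for all $t \in [T_{j-1}, T_j)$ and, denoting by $S \ge T_{j-1}$ the last
jump time of the chain $X$ before $T_j$, it follows that $\Pi_t(i) = 0$ and $X_t = i$ for all
$t \in [S, T_j)$. However, since $\mathbb{P}$-a.s. we have
$1_{X_t = i}\, 1_{\Pi_t(i) = 0}\, 1_{T_{j-1} \le t < T_j} = 0$ for every $t$, this is only possible
with zero probability.

Now we are able to prove (2.10). Noting that $\Pi_{T_j-}$ is supported on $h^{-1}(Y_{T_{j-1}})$, we
have, on $\{T_j < \infty\}$,

\[ \Pi_{T_j-}\Lambda 1_{h^{-1}(Y_{T_j})}
 = \sum_{i \in h^{-1}(Y_{T_{j-1}}),\ k \in h^{-1}(Y_{T_j})} \Pi_{T_j-}(i)\Lambda(i,k). \]

Since $T_j$ is a jump time for $Y$, the sets $h^{-1}(Y_{T_{j-1}})$ and $h^{-1}(Y_{T_j})$ are
disjoint, so that in the previous sum all nonzero terms correspond to indices $i \ne k$ and
consequently all terms are nonnegative. Since $X_{T_j} \in h^{-1}(Y_{T_j})$ it follows that

\[ \Pi_{T_j-}\Lambda 1_{h^{-1}(Y_{T_j})}
 \ge \sum_{i \in h^{-1}(Y_{T_{j-1}})} \Pi_{T_j-}(i)\Lambda(i, X_{T_j})
 = \sum_{i \in I} \Pi_{T_j-}(i)\Lambda(i, X_{T_j}) \qquad \text{on } \{T_j < \infty\}. \]

To conclude the proof it is enough to show that
$\mathbb{P}(\sum_{i \in I} \Pi_{T_j-}(i)\Lambda(i, X_{T_j}) = 0,\ T_j < \infty) = 0$.
Since again all terms in the last sum are nonnegative we have

\[ \begin{aligned}
\mathbb{P}\Big(\sum_{i \in I} \Pi_{T_j-}(i)\Lambda(i, X_{T_j}) = 0,\ T_j < \infty\Big)
&= \mathbb{P}\Big(\bigcap_{i \in I} [\Pi_{T_j-}(i)\Lambda(i, X_{T_j}) = 0,\ T_j < \infty]\Big) \\
&= \mathbb{P}\Big(\bigcap_{i \in I} [\{\Pi_{T_j-}(i) = 0,\ T_j < \infty\} \cup \{\Lambda(i, X_{T_j}) = 0,\ T_j < \infty\}]\Big) \\
&\le \mathbb{P}\Big(\bigcap_{i \in I} [\{X_{T_j-} \ne i,\ T_j < \infty\} \cup \{\Lambda(i, X_{T_j}) = 0,\ T_j < \infty\}]\Big) \\
&= \mathbb{P}\Big(\bigcap_{i \in I} [\{X_{T_j-} \ne i,\ \Lambda(i, X_{T_j}) > 0,\ T_j < \infty\} \cup \{\Lambda(i, X_{T_j}) = 0,\ T_j < \infty\}]\Big),
\end{aligned} \]

where we have used (2.11) in the inequality. But clearly

\[ \mathbb{P}\Big(\bigcap_{i \in I} [\{X_{T_j-} \ne i,\ \Lambda(i, X_{T_j}) > 0,\ T_j < \infty\} \cup \{\Lambda(i, X_{T_j}) = 0,\ T_j < \infty\}]\Big) = 0, \]

since, at time $T_j$, the chain $X$ must jump from some state $i \in I$ to $X_{T_j}$. □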

The rest of this section is devoted to the proof of Theorem 2.3. The method we use is to
construct a sequence $(\Pi_t^n(i))_{t \ge 0,\, i \in I}$ of approximate filtering processes, each
corresponding to observing the process $Y$ in a discrete set $V_n$ of times. $\Pi_t^n(i)$ are
constructed so as to converge to $\mathbb{P}(X_t = i \mid \mathcal{Y}_t^0)$ as $n \to \infty$. Then
we write down explicit filtering equations for $\Pi_t^n(i)$. Passing to the limit in these
equations we prove that $\Pi_t^n(i)$ converges to the solution $\Pi_t(i)$ of (2.6). This way we
identify $\Pi_t(i)$ with a modification of the filtering process
$\mathbb{P}(X_t = i \mid \mathcal{Y}_t^0)$.

For all integers $n, k \ge 0$ we set $t_k^n = 2^{-n} k$ and consider the grid
$V_n := \{t_k^n\}_{k \ge 0}$ with mesh $t_k^n - t_{k-1}^n = 2^{-n}$. For fixed $n$ and for
$t \ge 0$ we introduce the $\sigma$-algebras
$\mathcal{Y}_t^n = \sigma(Y_{t_0^n}, \dots, Y_{t_k^n} : t_k^n \le t)$. The filtration
$(\mathcal{Y}_t^n)_{t \ge 0}$ corresponds to observing the process $Y$ only at times $t_k^n$.

For any $t \ge 0$, we have $\mathcal{Y}_t^n \subset \mathcal{Y}_t^{n+1}$ and $\mathcal{Y}_t^0$ is
generated by $\bigcup_n \mathcal{Y}_t^n$, so by a martingale convergence theorem we have

\[ \lim_{n \to \infty} \mathbb{P}(X_t = i \mid \mathcal{Y}_t^n) = \mathbb{P}(X_t = i \mid \mathcal{Y}_t^0),
\qquad \mathbb{P}\text{-a.s.},\ i \in I. \tag{2.14} \]

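Next we show that the filtering processes $\mathbb{P}(X_t = i \mid \mathcal{Y}_t^n)$ have
modifications, denoted $\Pi_t^n(i)$, which satisfy explicit filtering equations. Recalling that
$\mu$ denotes the initial distribution of $X$, let us introduce processes
$(\Pi_t^n(i))_{t \ge 0,\, i \in I}$ as follows: for all $n \ge 0$ and $\omega \in \Omega$ define

\[ \Pi_0^n(\omega) = H_{Y_0(\omega)}[\mu], \tag{2.15} \]

\[ \Pi_t^n(\omega) = \Pi_{t_{k-1}^n}^n(\omega)\, e^{(t - t_{k-1}^n)\Lambda}, \qquad t_{k-1}^n \le t < t_k^n, \tag{2.16} \]

\[ \Pi_{t_k^n}^n(\omega) = H_{Y_{t_k^n}(\omega)}[\Pi_{t_k^n-}^n(\omega)] \tag{2.17} \]

for $k \ge 1$, where of course
$\Pi_{t_k^n-}^n(\omega) = \Pi_{t_{k-1}^n}^n(\omega)\, e^{(t_k^n - t_{k-1}^n)\Lambda} = \Pi_{t_{k-1}^n}^n(\omega)\, e^{2^{-n}\Lambda}$.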
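
The approximating processes (2.15)-(2.17) are easy to compute at the grid points: one alternates a prediction step with the matrix $e^{2^{-n}\Lambda}$ and a conditioning step with $H$. The following sketch does this in plain Python, with a small scaling-and-squaring routine for the matrix exponential. The data (`RATES`, `H_MAP`, `MU`, the observation path `y_at`) and all function names are hypothetical, introduced only for this illustration.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    norm = max(sum(abs(v) for v in row) for row in m)
    squarings = 0
    while norm > 0.5:
        norm /= 2.0
        squarings += 1
    scaled = [[v / 2 ** squarings for v in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def condition(a, mu, h):
    """Operator H_a; the zero-mass case falls back to the uniform law on h^{-1}(a)."""
    masked = [m if h[i] == a else 0.0 for i, m in enumerate(mu)]
    total = sum(masked)
    if total != 0.0:
        return [m / total for m in masked]
    size = sum(1 for v in h if v == a)
    return [1.0 / size if v == a else 0.0 for v in h]


def dyadic_filter(n, k_max, mu, rates, h, y_at):
    """Pi^n at the grid time k_max * 2**-n, following (2.15)-(2.17)."""
    step = 2.0 ** -n
    p = mat_exp([[step * v for v in row] for row in rates])  # e^{2^{-n} Lambda}
    pi = condition(y_at(0.0), mu, h)
    size = len(pi)
    for k in range(1, k_max + 1):
        predicted = [sum(pi[i] * p[i][j] for i in range(size)) for j in range(size)]
        pi = condition(y_at(k * step), predicted, h)
    return pi


# Hypothetical data: Y equals 0 on [0, 0.7) and 1 afterwards.
RATES = [[-2.0, 1.0, 1.0], [1.0, -3.0, 2.0], [1.0, 1.0, -2.0]]
H_MAP = [0, 0, 1]
MU = [0.25, 0.25, 0.5]


def y_at(t):
    return 0 if t < 0.7 else 1


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, dyadic_filter(n, 2 ** (n - 1), MU, RATES, H_MAP, y_at))  # time 0.5
```

Refining the grid changes the value at a fixed time less and less, which is the numerical counterpart of the convergence argument developed in the remainder of the section.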

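Lemma 2.5 For $t \ge 0$ and $i \in I$ we have
$\Pi_t^n(i) = \mathbb{P}(X_t = i \mid \mathcal{Y}_t^n)$ $\mathbb{P}$-a.s.

Proof. We fix $n$ and to simplify the notation we write $t_k$ instead of $t_k^n$. For every $n$
and $t$ we denote by $\tilde\Pi_t^n(i)$ an arbitrary version of
$\mathbb{P}(X_t = i \mid \mathcal{Y}_t^n)$. We consider the discrete-time processes
$(X_{t_k})_{k \ge 0}$ and $(Y_{t_k})_{k \ge 0}$ obtained evaluating $(X_t)_{t \ge 0}$ and
$(Y_t)_{t \ge 0}$ at times $t_k$. $(X_{t_k})_{k \ge 0}$ is a Markov chain with initial law $\mu$
and transition matrix $P = e^{(t_k - t_{k-1})\Lambda} = e^{2^{-n}\Lambda}$. Therefore, for every
$k \ge 0$, $\tilde\Pi_{t_k}^n(i) = \mathbb{P}(X_{t_k} = i \mid Y_{t_0}, \dots, Y_{t_k})$ satisfies,
$\mathbb{P}$-a.s.,

\[ \tilde\Pi_0^n = H_{Y_0}[\mu] \tag{2.18} \]

and the recursive equations

\[ \tilde\Pi_{t_k}^n = H_{Y_{t_k}}[\tilde\Pi_{t_{k-1}}^n P]
 = H_{Y_{t_k}}[\tilde\Pi_{t_{k-1}}^n e^{(t_k - t_{k-1})\Lambda}], \tag{2.19} \]

compare (2.3).

Next we fix $t_{k-1} \le t < t_k$, we take $f : I \to \mathbb{R}$ (identified with
$f \in \mathbb{R}^N$) and, using the fact that $\sigma(Y_{t_j}) \subset \sigma(X_{t_j})$ for every
$j$, and the Markov property of $(X_t)_{t \ge 0}$, we obtain, $\mathbb{P}$-a.s.,

\[ \begin{aligned}
\mathbb{E}(f(X_t) \mid \mathcal{Y}_t^n)
&= \mathbb{E}(f(X_t) \mid Y_{t_0}, \dots, Y_{t_{k-1}}) \\
&= \mathbb{E}[\mathbb{E}(f(X_t) \mid X_{t_0}, \dots, X_{t_{k-1}}) \mid Y_{t_0}, \dots, Y_{t_{k-1}}] \\
&= \mathbb{E}[(e^{(t - t_{k-1})\Lambda} f)(X_{t_{k-1}}) \mid Y_{t_0}, \dots, Y_{t_{k-1}}] \\
&= \tilde\Pi_{t_{k-1}}^n e^{(t - t_{k-1})\Lambda} f,
\end{aligned} \]

which shows that

\[ \tilde\Pi_t^n = \tilde\Pi_{t_{k-1}}^n e^{(t - t_{k-1})\Lambda}. \tag{2.20} \]

Comparing (2.18), (2.19), (2.20) with (2.15)-(2.17), we conclude that $\tilde\Pi^n$ and $\Pi^n$
are modifications of one another, and this proves the lemma. □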






Equations (2.15)-(2.17) describe the time evolution of the trajectories of $\Pi^n$. For
$k \ge 1$, $\Pi^n$ satisfies the differential equation

\[ (\Pi_t^n)' = \Pi_t^n \Lambda \qquad \text{for } t_{k-1}^n \le t < t_k^n, \tag{2.21} \]

where the derivative is understood as the right derivative at $t = t_{k-1}^n$. At each time
$t_k^n$, $\Pi^n$ takes on the value $H_{Y_{t_k^n}}[\Pi_{t_k^n-}^n]$, thus possibly making a jump.
Therefore equations (2.15)-(2.17) can also be written as a single integral equation, namely

\[ \Pi_t^n = H_{Y_0}[\mu] + \int_0^t \Pi_s^n \Lambda\, ds
 + \sum_{0 < t_k^n \le t} \big\{ H_{Y_{t_k^n}}[\Pi_{t_k^n-}^n] - \Pi_{t_k^n-}^n \big\},
\qquad t \ge 0. \tag{2.22} \]

In what follows we will often use the fact that $\Pi^n$ is continuously differentiable on each
interval $[t_{k-1}^n, t_k^n)$, $k \ge 1$, and moreover from (2.21) it follows that there exists a
constant $C > 0$ (depending only on $\Lambda$) such that for all $\omega \in \Omega$ and for all
$t \notin V_n$

\[ |(\Pi_t^n)'(\omega)| = |\Pi_t^n(\omega)\Lambda| \le C. \tag{2.23} \]

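Proof of Theorem 2.3. We consider again the jump times $T_j$ ($j \ge 1$) of the process $Y$ and for every $n$ define $\bar t^n_j = \min\{t^n_k \in \mathcal{D}^n : T_j \le t^n_k\}$ and $\underline{t}^n_j = \max\{t^n_k \in \mathcal{D}^n : t^n_k \le T_j\}$ on $\{T_j < \infty\}$; $\bar t^n_j = \underline{t}^n_j = \infty$ on $\{T_j = \infty\}$. Note that, P-a.s., no time $T_j$ belongs to $\mathcal{D}^n$ for $j \ge 1$. Then, for sufficiently large $n$ (depending on $\omega$), we have $\underline{t}^n_j < T_j < \bar t^n_j < \underline{t}^n_{j+1} < T_{j+1} < \bar t^n_{j+1}$ on $\{T_j < \infty\}$. We also define $T_0 = \bar t^n_0 = \underline{t}^n_0 = 0$.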

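For each $j \ge 0$ we consider the following statements:

$a_j$) $\Pi^n_{\bar t^n_j} \to \Pi_{T_j}$, P-a.s. on $\{T_j < \infty\}$;
$b_j$) we have, P-a.s.,

$$\sup_{t \in [\bar t^n_j, T_{j+1})} |\Pi^n_t - \Pi_t| \to 0 \quad \text{on } \{T_{j+1} < \infty\},$$

$$\sup_{t \in [\bar t^n_j, T)} |\Pi^n_t - \Pi_t| \to 0 \quad \text{for every } T > 0 \text{ on } \{T_j < \infty, T_{j+1} = \infty\};$$

$c_j$) for all $t \ge 0$ and $i \in I$

$$\Pi_t(i)\, 1_{T_j \le t < T_{j+1}} = P(X_t = i \mid \mathcal{Y}^0_t)\, 1_{T_j \le t < T_{j+1}} \quad \text{P-a.s.} \tag{2.24}$$

Note that $a_0$ trivially holds, since $\Pi^n_0 = H_{Y_0}[\mu] = \Pi_0$. We will prove that the following implications hold for all $j \ge 0$:

$$a_j \Rightarrow b_j \tag{2.25}$$
$$b_j \Rightarrow c_j \tag{2.26}$$
$$b_j + c_j \Rightarrow a_{j+1} \tag{2.27}$$

By induction on $j$ it follows in particular that $c_j$ holds for all $j \ge 0$. This implies that for all $i \in I$ and $t \ge 0$ we have

$$\Pi_t(i) = P(X_t = i \mid \mathcal{Y}^0_t) \quad \text{P-a.s.}$$

and concludes the proof of the theorem.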
Now it remains to prove (2.25)-(2.27).

Proof of (2.25). Suppose that $a_j$ is verified for some $j \ge 0$. First fix $\omega$ such that $T_{j+1}(\omega) < \infty$. On the interval $[T_j, T_{j+1})$ the process $Y$ is constant. For short, we denote $a := Y_t$ for all $t \in [T_j, T_{j+1})$. It follows from (2.17) that $\Pi^n_{t^n_k}$ is supported in $h^{-1}(a)$ for $t^n_k \in [\bar t^n_j, T_{j+1})$.

First take $i \notin h^{-1}(a)$. For $t \in [\bar t^n_j, T_{j+1})$ there exist two points $t^n_k, t^n_{k+1} \in \mathcal{D}^n$ such that $\bar t^n_j \le t^n_k \le t < t^n_{k+1} \le \bar t^n_{j+1}$. By (2.22) we have

$$\Pi^n_t(i) = \Pi^n_{t^n_k}(i) + \int_{t^n_k}^t (\Pi^n_s \Lambda)(i)\, ds. \tag{2.28}$$

But $\Pi^n_{t^n_k}(i) = 0$ for each $t^n_k \in [\bar t^n_j, T_{j+1})$; hence $\Pi^n_t(i) = \int_{t^n_k}^t (\Pi^n_s \Lambda)(i)\, ds$ and from (2.23) it follows that $|\Pi^n_t(i)| \le C(t - t^n_k) \le C 2^{-n}$. We conclude that $\sup_{t \in [\bar t^n_j, T_{j+1})} \Pi^n_t(i) \to 0$ as $n \to \infty$ for $i \notin h^{-1}(a)$, or equivalently

$$\sup_{t \in [\bar t^n_j, T_{j+1})} |1_{I \setminus h^{-1}(a)} * \Pi^n_t| \to 0, \tag{2.29}$$

which also implies

$$\sup_{t \in [\bar t^n_j, T_{j+1})} |1_{I \setminus h^{-1}(a)} * (\Pi^n_t - \Pi_t)| \to 0, \tag{2.30}$$

since $\Pi_t(i) = 0$ for $i \notin h^{-1}(a)$ and $t \in [\bar t^n_j, T_{j+1}) \subset [T_j, T_{j+1})$.

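To study $\Pi^n_t(i)$ in the case $i \in h^{-1}(a)$, we define for $t \in [\bar t^n_j, \bar t^n_{j+1})$

$$m^n(t) = \Big( \sum_{l \in h^{-1}(a)} \Pi^n_t(l) \Big)^{-1}.$$

Since $\sum_{i \in I} \Pi^n_t(i) = 1$ for all $t \ge 0$, by (2.29) we have

$$\sup_{t \in [\bar t^n_j, T_{j+1})} \Big| \sum_{l \in h^{-1}(a)} \Pi^n_t(l) - 1 \Big| \to 0,$$

so that $m^n(t)$ is well defined and

$$\sup_{t \in [\bar t^n_j, T_{j+1})} |m^n(t) - 1| \to 0. \tag{2.31}$$

We also note that $m^n$ is càdlàg on $[\bar t^n_j, \bar t^n_{j+1})$ and on each interval $[t^n_k, t^n_{k+1})$ contained in $[\bar t^n_j, \bar t^n_{j+1})$ it is continuously differentiable and satisfies $m^n(t^n_k) = 1$.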
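On the interval $[\bar t^n_j, \bar t^n_{j+1})$, if $i \in h^{-1}(a)$, $\Pi^n_t(i)$ satisfies

$$\begin{aligned}
\Pi^n_t(i) &= \Pi^n_{\bar t^n_j}(i) + \int_{\bar t^n_j}^t (\Pi^n_s \Lambda)(i)\, ds + \sum_{\bar t^n_j < t^n_k \le t} \big( H_a[\Pi^n_{t^n_k-}](i) - \Pi^n_{t^n_k-}(i) \big) \\
&= \Pi^n_{\bar t^n_j}(i) + \int_{\bar t^n_j}^t (\Pi^n_s \Lambda)(i)\, ds + \sum_{\bar t^n_j < t^n_k \le t} \Big( \frac{\Pi^n_{t^n_k-}(i)}{\sum_{l \in h^{-1}(a)} \Pi^n_{t^n_k-}(l)} - \Pi^n_{t^n_k-}(i) \Big) \\
&= \Pi^n_{\bar t^n_j}(i) + \int_{\bar t^n_j}^t (\Pi^n_s \Lambda)(i)\, ds + \sum_{\bar t^n_j < t^n_k \le t} \Pi^n_{t^n_k-}(i) \big( m^n(t^n_k-) - m^n(t^n_{k-1}) \big) \\
&= \Pi^n_{\bar t^n_j}(i) + \int_{\bar t^n_j}^t (\Pi^n_s \Lambda)(i)\, ds + \int_{\bar t^n_j}^t \sum_{\bar t^n_j < t^n_k \le t} \Pi^n_{t^n_k-}(i)\, 1_{[t^n_{k-1}, t^n_k)}(s)\, (m^n)'(s)\, ds.
\end{aligned}$$

In the third equality we used the fact that $m^n(t^n_k) = 1$ for $t^n_k \in [\bar t^n_j, \bar t^n_{j+1})$. Adding and subtracting $\int_{\bar t^n_j}^t \Pi^n_s(i)\,(m^n)'(s)\, ds$ we get

$$\Pi^n_t(i) = \Pi^n_{\bar t^n_j}(i) + \int_{\bar t^n_j}^t (\Pi^n_s \Lambda)(i)\, ds + \int_{\bar t^n_j}^t \Pi^n_s(i)\,(m^n)'(s)\, ds + A^n_t(i) \tag{2.32}$$

where

$$A^n_t(i) = \int_{\bar t^n_j}^t \Big[ \sum_{\bar t^n_j < t^n_k \le t} \Pi^n_{t^n_k-}(i)\, 1_{[t^n_{k-1}, t^n_k)}(s) - \Pi^n_s(i) \Big] (m^n)'(s)\, ds.$$

Thanks to (2.23), which also implies that $|(m^n)'(s)| \le C$ for a.e. $s$, we obtain

$$\sup_{t \in [\bar t^n_j, \bar t^n_{j+1})} |A^n_t(i)| \to 0 \quad \text{as } n \to \infty. \tag{2.33}$$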



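Moreover we can compute

$$(m^n)'(t) = - \frac{\sum_{l \in h^{-1}(a)} \Pi^n_t(l)'}{\big( \sum_{l \in h^{-1}(a)} \Pi^n_t(l) \big)^2} = - \sum_{l \in h^{-1}(a)} \Pi^n_t(l)' + B^n(t),$$

where

$$B^n(t) = \sum_{l \in h^{-1}(a)} \Pi^n_t(l)' \, \big( 1 - m^n(t)^2 \big)$$

and we have

$$\sup_{t \in [\bar t^n_j, T_{j+1})} |B^n(t)| \to 0 \quad \text{as } n \to \infty \tag{2.34}$$

by (2.31) and (2.23). Now (2.32) becomes

$$\begin{aligned}
\Pi^n_t(i) &= \Pi^n_{\bar t^n_j}(i) + \int_{\bar t^n_j}^t (\Pi^n_s \Lambda)(i)\, ds - \int_{\bar t^n_j}^t \Pi^n_s(i) \sum_{l \in h^{-1}(a)} \Pi^n_s(l)'\, ds + \int_{\bar t^n_j}^t \Pi^n_s(i) B^n(s)\, ds + A^n_t(i) \\
&= \Pi^n_{\bar t^n_j}(i) + \int_{\bar t^n_j}^t (\Pi^n_s \Lambda)(i)\, ds - \int_{\bar t^n_j}^t \Pi^n_s(i) \sum_{l \in h^{-1}(a)} \Pi^n_s(l)'\, ds + C^n_t(i) + A^n_t(i),
\end{aligned}$$

where we have defined

$$C^n_t(i) = \int_{\bar t^n_j}^t \Pi^n_s(i) B^n(s)\, ds. \tag{2.35}$$

By (2.21), for $s \in [\bar t^n_j, t)$, $s \notin \mathcal{D}^n$, we have $\sum_{l \in h^{-1}(a)} \Pi^n_s(l)' = \sum_{l \in h^{-1}(a)} (\Pi^n_s \Lambda)(l) = \Pi^n_s \Lambda 1_{h^{-1}(a)}$ and we get

$$\Pi^n_t(i) = \Pi^n_{\bar t^n_j}(i) + \int_{\bar t^n_j}^t (\Pi^n_s \Lambda)(i)\, ds - \int_{\bar t^n_j}^t \Pi^n_s(i)\, (\Pi^n_s \Lambda 1_{h^{-1}(a)})\, ds + C^n_t(i) + A^n_t(i).$$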

We denote by $A^n_t$ and $C^n_t$ the $N$-dimensional vectors with components $A^n_t(i)$ and $C^n_t(i)$, respectively, if $i \in h^{-1}(a)$, and $0$ otherwise. Introducing the vector field $G_a$ defined by

$$G_a(\nu) = 1_{h^{-1}(a)} * [(\nu\Lambda) - (\nu\Lambda 1_{h^{-1}(a)})\nu], \quad \nu \in \Delta,$$

we finally arrive at

$$1_{h^{-1}(a)} * \Pi^n_t = 1_{h^{-1}(a)} * \Pi^n_{\bar t^n_j} + \int_{\bar t^n_j}^t G_a(\Pi^n_s)\, ds + C^n_t + A^n_t, \quad t \in [\bar t^n_j, T_{j+1}).$$



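Now we are able to obtain an estimate on the difference $1_{h^{-1}(a)} * [\Pi_t - \Pi^n_t]$ for $t \in [\bar t^n_j, T_{j+1})$. We recall that $\Pi_t \in \Delta_a$ for all $t \in [T_j, T_{j+1})$, so that (2.5) implies

$$1_{h^{-1}(a)} * \Pi_t = 1_{h^{-1}(a)} * \Pi_{T_j} + \int_{T_j}^t G_a(\Pi_s)\, ds, \quad t \in [T_j, T_{j+1}).$$

So, for $t \in [\bar t^n_j, T_{j+1})$,

$$|1_{h^{-1}(a)} * (\Pi_t - \Pi^n_t)| \le |1_{h^{-1}(a)} * (\Pi_{T_j} - \Pi^n_{\bar t^n_j})| + \int_{T_j}^{\bar t^n_j} |G_a(\Pi_s)|\, ds + |C^n_t| + |A^n_t| + \int_{\bar t^n_j}^t |G_a(\Pi_s) - G_a(\Pi^n_s)|\, ds.$$

Note that $G_a$ is bounded and globally Lipschitz on $\Delta$. We denote by $K$ some bound and by $L$ its Lipschitz constant ($K$ and $L$ depend on $\omega$, since $a = Y_{T_j}(\omega)$). Then we obtain, noting that $\bar t^n_j - T_j \le 2^{-n}$,

$$|1_{h^{-1}(a)} * (\Pi_{T_j} - \Pi^n_{\bar t^n_j})| + \int_{T_j}^{\bar t^n_j} |G_a(\Pi_s)|\, ds + |C^n_t| + |A^n_t| \le |1_{h^{-1}(a)} * (\Pi_{T_j} - \Pi^n_{\bar t^n_j})| + K 2^{-n} + |C^n_t| + |A^n_t|$$

and

$$|1_{h^{-1}(a)} * (\Pi_t - \Pi^n_t)| \le |1_{h^{-1}(a)} * (\Pi_{T_j} - \Pi^n_{\bar t^n_j})| + K 2^{-n} + |C^n_t| + |A^n_t| + L \int_{\bar t^n_j}^t |\Pi_s - \Pi^n_s|\, ds.$$

Since $1_{h^{-1}(a)} * \Pi_s = \Pi_s$ we have $|\Pi_s - \Pi^n_s| \le |1_{h^{-1}(a)} * (\Pi_s - \Pi^n_s)| + |1_{I \setminus h^{-1}(a)} * \Pi^n_s|$ and we obtain

$$\begin{aligned}
|1_{h^{-1}(a)} * (\Pi_t - \Pi^n_t)| &\le |1_{h^{-1}(a)} * (\Pi_{T_j} - \Pi^n_{\bar t^n_j})| + K 2^{-n} + |C^n_t| + |A^n_t| + L \int_{\bar t^n_j}^t |1_{I \setminus h^{-1}(a)} * \Pi^n_s|\, ds \\
&\quad + L \int_{\bar t^n_j}^t |1_{h^{-1}(a)} * (\Pi_s - \Pi^n_s)|\, ds \\
&=: D^n_t + L \int_{\bar t^n_j}^t |1_{h^{-1}(a)} * (\Pi_s - \Pi^n_s)|\, ds.
\end{aligned}$$

From the induction assumption $a_j$ and from (2.29), (2.35), (2.33), (2.34) it follows that $\sup_{t \in [\bar t^n_j, T_{j+1})} |D^n_t| \to 0$. From the Gronwall lemma we conclude that $\sup_{t \in [\bar t^n_j, T_{j+1})} |1_{h^{-1}(a)} * (\Pi_t - \Pi^n_t)| \to 0$. Recalling (2.30), this proves $b_j$ in the case $T_{j+1}(\omega) < \infty$.

The case $T_j(\omega) < \infty$, $T_{j+1}(\omega) = \infty$ can be proved by the same arguments, replacing the interval $[\bar t^n_j, T_{j+1})$ by $[\bar t^n_j, T)$ in the previous passages.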

Proof of (2.26). Assume that $b_j$ holds for some $j \ge 0$. For every $i \in I$ and $t \ge 0$, by lemma 2.5 we have

$$\Pi^n_t(i)\, 1_{T_j < t < T_{j+1}} = P(X_t = i \mid \mathcal{Y}^n_t)\, 1_{T_j < t < T_{j+1}}, \quad \text{P-a.s.}$$

Let $n \to \infty$. Since $\bar t^n_j \to T_j$, for large $n$ we have $\bar t^n_j < t$ and $b_j$ implies that $\Pi^n_t(i) \to \Pi_t(i)$. Recalling (2.14) we conclude that

$$\Pi_t(i)\, 1_{T_j < t < T_{j+1}} = P(X_t = i \mid \mathcal{Y}^0_t)\, 1_{T_j < t < T_{j+1}}, \quad \text{P-a.s.}$$

Since $P(T_j = t) = 0$ this shows that $c_j$ holds.

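Proof of (2.27). We suppose that $b_j$ and $c_j$ hold for some $j \ge 0$. We write $\Pi^n(i,t)$ and $\Pi^n(t)$ instead of $\Pi^n_t(i)$ and $\Pi^n_t$ for better readability. We recall that, by (2.5), for $T_j \le t < T_{j+1}$, we have $\Pi_t(i) = 0$ if $i \notin h^{-1}(Y_{T_j})$ and

$$\Pi'_t(i) = (\Pi_t \Lambda)(i) - (\Pi_t \Lambda 1_{h^{-1}(Y_{T_j})})\, \Pi_t(i)$$

if $i \in h^{-1}(Y_{T_j})$. Since we also assume that $c_j$ holds, a previous lemma ensures that

$$\Pi(T_{j+1}-)\Lambda 1_{h^{-1}(Y_{T_{j+1}})} > 0 \quad \text{on } \{T_{j+1} < \infty\}. \tag{2.36}$$

Now we fix $\omega$ such that $T_{j+1} < \infty$ and we wish to prove that $\lim_{n \to \infty} \Pi^n(\bar t^n_{j+1}) = \Pi_{T_{j+1}}$. Note that $Y_{T_{j+1}} = Y_{\bar t^n_{j+1}}$, so that if $i \notin h^{-1}(Y_{T_{j+1}}) = h^{-1}(Y_{\bar t^n_{j+1}})$, then $\Pi^n(i, \bar t^n_{j+1}) = 0 = \Pi_{T_{j+1}}(i)$. Let now $i \in h^{-1}(Y_{T_{j+1}})$; then, since by (2.17) we have $\Pi^n(\bar t^n_{j+1}) = H_{Y_{\bar t^n_{j+1}}}[\Pi^n(\bar t^n_{j+1}-)]$, we therefore have

$$\Pi^n(i, \bar t^n_{j+1}) = \frac{\Pi^n(i, \bar t^n_{j+1}-)}{\sum_{k \in h^{-1}(Y_{T_{j+1}})} \Pi^n(k, \bar t^n_{j+1}-)}, \tag{2.37}$$

provided the denominator is not zero, and we will check that this is indeed the case. Thanks to (2.21) we can compute the right-hand side by a Taylor expansion around $\underline{t}^n_{j+1}$:

$$\Pi^n(\bar t^n_{j+1}-) = \Pi^n(\underline{t}^n_{j+1}) + (\Pi^n)'(\underline{t}^n_{j+1})\,(\bar t^n_{j+1} - \underline{t}^n_{j+1}) + o(\bar t^n_{j+1} - \underline{t}^n_{j+1}).$$

Noting that $\bar t^n_{j+1} - \underline{t}^n_{j+1} = 2^{-n}$ and that $\Pi^n(i, \underline{t}^n_{j+1}) = \Pi^n(k, \underline{t}^n_{j+1}) = 0$ for $k \in h^{-1}(Y_{T_{j+1}})$ we obtain

$$\Pi^n(i, \bar t^n_{j+1}) = \frac{2^{-n} [\Pi^n(\underline{t}^n_{j+1})\Lambda](i) + o(2^{-n})}{2^{-n} \sum_{k \in h^{-1}(Y_{T_{j+1}})} [\Pi^n(\underline{t}^n_{j+1})\Lambda](k) + o(2^{-n})}.$$

Since $\underline{t}^n_{j+1}$ tends to $T_{j+1}-$, and since $b_j$ shows that $\Pi^n(t)$ converges to $\Pi$ uniformly in a left neighborhood of $T_{j+1}$, we have $\Pi^n(\underline{t}^n_{j+1}) \to \Pi(T_{j+1}-)$ and we finally obtain

$$\lim_{n \to \infty} \Pi^n(i, \bar t^n_{j+1}) = \frac{[\Pi(T_{j+1}-)\Lambda](i)}{\sum_{k \in h^{-1}(Y_{T_{j+1}})} [\Pi(T_{j+1}-)\Lambda](k)}. \tag{2.38}$$

We note that the right-hand side of (2.38) is well defined by (2.36). For the same reason the denominator in (2.37) is strictly positive for large $n$, which justifies the previous passages. Finally, since $\Pi_{T_{j+1}} = H_{Y_{T_{j+1}}}[\Pi(T_{j+1}-)\Lambda]$ by the jump condition in the filtering equation, (2.38) also shows that $\lim_{n \to \infty} \Pi^n(i, \bar t^n_{j+1}) = \Pi_{T_{j+1}}(i)$ and concludes the proof that $a_{j+1}$ holds. □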






3 Basic properties of the filtering process 
3.1 Canonical set-up 

In the rest of this paper it is convenient to assume that the unobserved process is defined in a 
canonical set-up as follows. 

1. Let $\Omega$ be the set of càdlàg functions $\omega : \mathbb{R}_+ \to I$, i.e. the set of right-continuous functions having finite left limits on $(0, \infty)$. We denote $X_t(\omega) = \omega(t)$ for $\omega \in \Omega$ and $t \ge 0$, and we introduce the σ-algebras

$$\mathcal{F}^0_t = \sigma(X_s : s \in [0,t]), \qquad \mathcal{F}^0 = \sigma(X_s : s \ge 0).$$

$(\mathcal{F}^0_t)_{t \ge 0}$ is thus the natural filtration of $(X_t)_{t \ge 0}$.

2. Let $\Delta$ denote the set of probability measures on $I$, identified with the canonical simplex of $\mathbb{R}^N$, where $N$ is the cardinality of $I$.

3. For every $\mu \in \Delta$ we denote by $P_\mu$ the unique probability measure on $(\Omega, \mathcal{F}^0)$ that makes $(X_t)$ a Markov process on $I$ with generator $\Lambda$ and initial distribution $\mu$, i.e. such that for every $t, s \ge 0$ and every real function $f$ on $I$ we have

$$E_\mu[f(X_{t+s}) \mid \mathcal{F}^0_t] = (e^{s\Lambda} f)(X_t), \quad P_\mu\text{-a.s.}$$

and $P_\mu(X_0 = i) = \mu(i)$, $i \in I$. Here $E_\mu$ denotes of course the expectation with respect to $P_\mu$. If $\mu$ is concentrated at some $i \in I$ we write $P_i$ instead of $P_\mu$.

4. We still define the observation process $(Y_t)_{t \ge 0}$ and its natural filtration $(\mathcal{Y}^0_t)_{t \ge 0}$ by

$$Y_t = h(X_t), \qquad \mathcal{Y}^0_t = \sigma(Y_s : s \in [0,t]), \quad t \ge 0.$$

Remark 3.1 We could also define the space $\Omega$ as the set of all functions $\omega : \mathbb{R}_+ \to I$ which are piecewise-constant, right-continuous and with a finite number of jumps in every bounded interval, i.e. of the form $\omega(t) = \sum_{k=0}^{\infty} a_k 1_{[t_k, t_{k+1})}(t)$ for $a_k \in I$ and for $0 = t_0 < t_1 < t_2 < \dots$, where the sequence $(t_k)$ has no cluster point in $[0, \infty)$. The other definitions remain unchanged, and all the subsequent results still hold.

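Let $(T_n)_{n \ge 1}$ denote the sequence of jump times of $(Y_t)$, with the convention that $T_n = \infty$ for all $n$ if no jump occurs and $T_n < T_{n+1} = T_{n+2} = \dots = \infty$ if precisely $n$ jumps occur.

For every $\omega \in \Omega$ we consider the corresponding trajectory $Y_t(\omega)$ and jump times $T_n(\omega)$ and we define the filtering process $\Pi^\mu_t(\omega)$ as the solution of

$$\begin{aligned}
\Pi^\mu_t(\omega) &= H_{Y_0(\omega)}[\mu] + \int_0^t \big\{ 1_{h^{-1}(Y_s(\omega))} * (\Pi^\mu_s(\omega)\Lambda) - (\Pi^\mu_s(\omega)\Lambda 1_{h^{-1}(Y_s(\omega))})\, \Pi^\mu_s(\omega) \big\}\, ds \\
&\quad + \sum_{0 < T_n(\omega) \le t} \big\{ H_{Y_{T_n}(\omega)}[\Pi^\mu_{T_n-}(\omega)\Lambda] - \Pi^\mu_{T_n-}(\omega) \big\}, \quad t \ge 0,
\end{aligned} \tag{3.1}$$

where, as usual, $\Pi^\mu_{T_n-}(\omega) = \lim_{t \to T_n(\omega),\, t < T_n(\omega)} \Pi^\mu_t(\omega)$ is defined on $\{\omega \in \Omega : T_n(\omega) < \infty\}$.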
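A pathwise solution of equation (3.1) along one observed trajectory can be approximated as in the following sketch. It is only illustrative: an explicit Euler scheme stands in for the exact flow between jumps, $H_b$ is assumed to be "restrict to $h^{-1}(b)$ and renormalise", and the function names and the encoding of the observed path as a list of pairs $(T_n, Y_{T_n})$ are hypothetical choices.

```python
import math


def restrict_normalize(a, nu, h):
    """H_a[nu]: keep the entries of nu on h^{-1}(a), then renormalise."""
    w = [x if h[i] == a else 0.0 for i, x in enumerate(nu)]
    total = sum(w)
    if total <= 0.0:
        raise ValueError("no mass on the observed fiber")
    return [x / total for x in w]


def drift(nu, Lam, h, a):
    """Integrand of (3.1): 1_{h^{-1}(a)} * (nu Lam) - (nu Lam 1_{h^{-1}(a)}) nu."""
    n = len(nu)
    nuL = [sum(nu[i] * Lam[i][j] for i in range(n)) for j in range(n)]
    on_fiber = [nuL[j] if h[j] == a else 0.0 for j in range(n)]
    rate = sum(on_fiber)
    return [on_fiber[j] - rate * nu[j] for j in range(n)]


def run_filter(mu, Lam, h, y0, jumps, t_end, dt=1e-3):
    """Euler approximation of Pi_{t_end}; jumps = [(T_1, Y_{T_1}), ...]."""
    n = len(mu)
    pi = restrict_normalize(y0, mu, h)        # initial value H_{Y_0}[mu]
    t, a = 0.0, y0
    for T, b in list(jumps) + [(t_end, None)]:
        steps = max(1, math.ceil((T - t) / dt))
        step = (T - t) / steps
        for _ in range(steps):                # deterministic motion between jumps
            d = drift(pi, Lam, h, a)
            pi = [pi[j] + step * d[j] for j in range(n)]
        t = T
        if b is not None:                     # jump: Pi_T = H_b[Pi_{T-} Lam]
            piL = [sum(pi[i] * Lam[i][j] for i in range(n)) for j in range(n)]
            pi = restrict_normalize(b, piL, h)
            a = b
    return pi
```

The jump times are assumed to be listed in increasing order and to lie before `t_end`.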

Equation (3.1) is the same as (2.6) and, as explained before, uniquely determines a $(\mathcal{Y}^0_t)$-adapted, càdlàg process $(\Pi^\mu_t)$ taking values in the effective simplex $\Delta_e = \cup_{a \in O} \Delta_a$. By theorem
12.31 for every \i G A, t > and i G 7 we have II^(z) = P^Xt = i\y\ ?), 7^-a.s. and, consequently, 
nf (i) = P M (X t = i[3^), P^-a.s. 

In what follows the process (11^) will also be considered under a different probability P p with 
p G A, p 7^ /i. In this case the equality II^(i) = P p (X t = ityt) is generally false. 
Remark 3.2 We note that the process $(\Pi^\mu_t)$ depends on $\mu$ through its initial value $\Pi^\mu_0 = H_{Y_0}[\mu]$. In general $\Pi^\mu_0$ is a random variable and $\Pi^\mu_0 \ne \mu$. However, if the unobserved process $(X_t)$ has initial distribution $\nu$ in the effective simplex $\Delta_e$ then the filtering process $(\Pi^\nu_t)$ starts at $\nu$, $P_\nu$-a.s. Indeed, if $\nu \in \Delta_a$, then $Y_0 = a$ $P_\nu$-a.s. and since $H_a[\nu] = \nu$ it follows that $\Pi^\nu_0 = \nu$ $P_\nu$-a.s.

3.2 Prediction 

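As a preliminary to further properties of the filtering process we need to prove the following result, which has an intrinsic interest, since it solves the so-called prediction problem: at any time $t$ it allows one to compute the distribution of the unobserved process after time $t$ conditional on the available observation up to $t$.

Proposition 3.3 For every $\mu \in \Delta$, $t \ge 0$ and $\Gamma \in \mathcal{F}^0$ we have

$$P_\mu(X_{t+\cdot} \in \Gamma \mid \mathcal{Y}^0_t) = P_{\Pi^\mu_t}(\Gamma), \quad P_\mu\text{-a.s.}$$

That is, for every present time $t$, and conditional on the past observation of $Y$ up to $t$, the probability that the future trajectories of $X$ will belong to some set $\Gamma$ is best predicted by $P_{\Pi^\mu_t}(\Gamma)$, i.e. one computes $P_\mu(\Gamma)$ and replaces $\mu$ by $\Pi^\mu_t$.

Proof. Noting that $\mathcal{Y}^0_t \subset \mathcal{F}^0_t$, by the Markov property of $X$ and the fact that $\Pi^\mu$ is the filtering process we have

$$P_\mu(X_{t+\cdot} \in \Gamma \mid \mathcal{Y}^0_t) = E_\mu[P_\mu(X_{t+\cdot} \in \Gamma \mid \mathcal{F}^0_t) \mid \mathcal{Y}^0_t] = E_\mu[P_{X_t}(\Gamma) \mid \mathcal{Y}^0_t] = \sum_{i \in I} P_i(\Gamma)\, \Pi^\mu_t(i) = P_{\Pi^\mu_t}(\Gamma). \quad □$$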
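As a worked special case of the prediction formula (an illustration added here, not a statement from the source), take $\Gamma = \{\omega : \omega(s) = j\}$ for fixed $s \ge 0$ and $j \in I$. Using the standard semigroup identity $P_i(X_s = j) = (e^{s\Lambda})_{ij}$, the prediction reduces to a vector-matrix product:

```latex
P_\mu(X_{t+s} = j \mid \mathcal{Y}^0_t)
  = P_{\Pi^\mu_t}(X_s = j)
  = \sum_{i \in I} \Pi^\mu_t(i)\,(e^{s\Lambda})_{ij}
  = (\Pi^\mu_t\, e^{s\Lambda})(j).
```

So the law of $X_{t+s}$ given the observations up to time $t$ is obtained by propagating the current filter value through the unconditioned semigroup.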



3.3 The Markov property of the filtering process

For $t \ge 0$, $\nu \in \Delta_e$ and $A \in \mathcal{B}(\Delta_e)$ (the Borel σ-algebra of $\Delta_e$) we define

$$R_t(\nu, A) = P_\nu(\Pi^\nu_t \in A).$$

It can be easily verified that $R_t$ is a Markov kernel on $(\Delta_e, \mathcal{B}(\Delta_e))$. The following proposition asserts the Markov property of the filtering process $(\Pi^\mu_t)$, corresponding to an arbitrary fixed initial distribution $\mu \in \Delta$ of the unobserved process $(X_t)$.

Proposition 3.4 $(R_t)$ is a Markov transition function on $(\Delta_e, \mathcal{B}(\Delta_e))$ and for $\mu \in \Delta$, $t, s \ge 0$ and $A \in \mathcal{B}(\Delta_e)$ we have

$$P_\mu(\Pi^\mu_{t+s} \in A \mid \mathcal{Y}^0_t) = R_s(\Pi^\mu_t, A).$$

In other words, for every $\mu \in \Delta$, in the probability space $(\Omega, \mathcal{F}^0, P_\mu)$ the process $(\Pi^\mu_t)$ is a Markov process with respect to $(\mathcal{Y}^0_t)$, taking values in $\Delta_e$ and having transition function $(R_t)$.

Proof. We first introduce a family of stochastic processes $(\Pi_t(\nu))$ parametrized by $\nu \in \Delta_e$. For every $\omega \in \Omega$ we define $\Pi_t(\omega, \nu)$ as the solution of

$$\begin{aligned}
\Pi_t(\omega, \nu) &= \nu + \int_0^t \big\{ 1_{h^{-1}(Y_s(\omega))} * (\Pi_s(\omega,\nu)\Lambda) - (\Pi_s(\omega,\nu)\Lambda 1_{h^{-1}(Y_s(\omega))})\, \Pi_s(\omega,\nu) \big\}\, ds \\
&\quad + \sum_{0 < T_n(\omega) \le t} \big\{ H_{Y_{T_n}(\omega)}[\Pi_{T_n-}(\omega,\nu)\Lambda] - \Pi_{T_n-}(\omega,\nu) \big\}.
\end{aligned} \tag{3.2}$$

Since $\nu$ belongs to $\Delta_e$, $(\Pi_t(\nu))$ takes values in $\Delta_e$ and it is a $(\mathcal{Y}^0_t)$-adapted, càdlàg process. Moreover $(\omega, t, \nu) \to \Pi_t(\omega, \nu)$ is measurable with respect to $\mathcal{F}^0 \times \mathcal{B}(\mathbb{R}_+) \times \mathcal{B}(\Delta_e)$.

Note that the pathwise evolution of $(\Pi_t(\nu))$ is described by the same differential equation as for $(\Pi^\mu_t)$, but these two processes differ in general because of the initial condition: $(\Pi_t(\nu))$ starts at $\nu$, whereas $\Pi^\mu_0(\omega) = H_{Y_0(\omega)}[\mu]$. Clearly, we have $\Pi^\mu_t(\omega) = \Pi_t(\omega, \Pi^\mu_0(\omega))$. If $\mu = \nu \in \Delta_e$ then, as noted in remark 3.2, we have $\Pi^\nu_0 = \nu$ and consequently $(\Pi^\nu_t)$ and $(\Pi_t(\nu))$ are the same process.

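Let us define the translation operators $\theta_t : \Omega \to \Omega$ by $(\theta_t\omega)(s) = \omega(t+s)$ for $t, s \ge 0$. By the uniqueness of the solution of equation (3.2) we have, for all $t, s \ge 0$ and $\omega \in \Omega$,

$$\Pi_{t+s}(\omega, \nu) = \Pi_s(\theta_t\omega, \Pi_t(\omega, \nu))$$

and replacing $\nu$ by $\Pi^\mu_0(\omega)$ we obtain

$$\Pi^\mu_{t+s}(\omega) = \Pi_{t+s}(\omega, \Pi^\mu_0(\omega)) = \Pi_s(\theta_t\omega, \Pi_t(\omega, \Pi^\mu_0(\omega))) = \Pi_s(\theta_t\omega, \Pi^\mu_t(\omega)).$$

Fixing $t, s \ge 0$ and noting that $\Pi^\mu_t$ is $\mathcal{Y}^0_t$-measurable it follows that

$$P_\mu(\Pi^\mu_{t+s} \in A \mid \mathcal{Y}^0_t) = P_\mu(\Pi_s(\theta_t(\cdot), \Pi^\mu_t) \in A \mid \mathcal{Y}^0_t) = g(\Pi^\mu_t), \quad P_\mu\text{-a.s.}, \tag{3.3}$$

where we define

$$g(\rho) := P_\mu(\Pi_s(\theta_t(\cdot), \rho) \in A \mid \mathcal{Y}^0_t), \quad \rho \in \Delta_e.$$

Note that the event $\{\Pi_s(\theta_t(\cdot), \rho) \in A\}$ can be written in the form $\{X_{t+\cdot} \in \Gamma_\rho\}$ where $\Gamma_\rho := \{\omega \in \Omega : \Pi_s(\omega, \rho) \in A\}$. So it follows from proposition 3.3 that

$$g(\rho) = P_\mu(X_{t+\cdot} \in \Gamma_\rho \mid \mathcal{Y}^0_t) = P_{\Pi^\mu_t}(\Gamma_\rho) = P_{\Pi^\mu_t}(\Pi_s(\rho) \in A).$$

Replacing in (3.3) we obtain the required equality $P_\mu(\Pi^\mu_{t+s} \in A \mid \mathcal{Y}^0_t) = g(\Pi^\mu_t) = R_s(\Pi^\mu_t, A)$. The Chapman-Kolmogorov equation for $(R_t)$ now follows from this equality and the fact noted earlier that $\Pi^\nu_t = \Pi_t(\nu)$ for $\nu \in \Delta_e$. □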

Remark 3.5 (Filter instability). An important issue, from both the theoretical and the computational viewpoint, is the stability of the filtering process. Stability is often formulated in terms of appropriate ergodic properties of the Markov filtering process. It is an interesting fact that stability essentially fails to hold in the case of noise-free observation under consideration. This is the content of section 3 of [2], where the authors consider the Markov chain with rate transition matrix given by (1.4) and the observation process corresponding to the function $h$ defined in (1.5). They show that the filtering process has infinitely many invariant measures, and that the solutions of the filtering equations corresponding to different initializations do not converge to one another: see subsection 3.2 in [2] for a detailed discussion.

4 The filtering process and the distribution of the observation process

In this section we show that the law of the observation process can be described by means of the filtering process through explicit formulae. These results will be applied in section 5, but they also have some intrinsic interest: we present an application in subsection 4.1.

We will use the canonical set-up introduced in subsection 3.1: we fix $\mu \in \Delta$ as an initial distribution of $X$ and we construct the probability space $(\Omega, \mathcal{F}^0, P_\mu)$. In the rest of this section all stochastic processes will be considered under $P_\mu$.

Let $(T_n)_{n \ge 1}$ be the sequence of jump times of $Y$, with the convention that $T_n = \infty$ for all $n$ if no jump occurs and $T_n < T_{n+1} = T_{n+2} = \dots = \infty$ if precisely $n$ jumps occur. We let $T_0 = 0$. In the following we will consider the sojourn times $S_n = T_n - T_{n-1}$ and the positions $Y_{T_{n-1}}$, $Y_{T_n-}$ of $Y$ at jump times and immediately before jump times respectively ($n \ge 1$). These random variables are only defined on the event $\{T_{n-1} < \infty\}$. Nevertheless the σ-algebras

$$\Sigma_{n-1} = \sigma(Y_0, T_1, Y_{T_1}, \dots, T_{n-1}, Y_{T_{n-1}}), \qquad \Sigma^+_{n-1} = \sigma(Y_0, T_1, Y_{T_1}, \dots, T_{n-1}, Y_{T_{n-1}}, T_n), \tag{4.1}$$

can be defined in the usual way. In particular $\Sigma_0 = \sigma(Y_0)$, $\Sigma^+_0 = \sigma(Y_0, T_1)$.

Since the trajectories of $Y$ are constant between jump times, the law of $Y$ is completely determined by the finite-dimensional distributions of the stochastic process $\{Y_0, T_1, Y_{T_1}, T_2, Y_{T_2}, \dots\}$. These in turn can be described by specifying the distribution of $Y_0$, which is obvious, and the family of conditional probabilities

$$P_\mu(S_n > t, T_{n-1} < \infty \mid \Sigma_{n-1}), \qquad P_\mu(Y_{T_n} = b, T_n < \infty \mid \Sigma^+_{n-1}), \qquad t \ge 0,\ b \in O,\ n \ge 1.$$

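In order to present explicit formulae for these probabilities we need to introduce some notation. For $\nu \in \Delta_e$ we define a probability $q(\nu) = (q(\nu, b))_{b \in O}$ on $O$ in the following way. For every $a \in O$ we first fix an arbitrary probability $q_a = (q_a(b))_{b \in O}$ on $O$ supported in $O \setminus \{a\}$ (we exclude the trivial case where $h$ is constant); the exact values of $q_a$ are irrelevant. Next, if $\nu \in \Delta_a$ for some $a \in O$, we define for every $b \in O$

$$q(\nu, b) = \begin{cases} - \dfrac{\nu\Lambda 1_{h^{-1}(b)}}{\nu\Lambda 1_{h^{-1}(a)}}\, 1_{b \ne a}, & \text{if } \nu\Lambda 1_{h^{-1}(a)} \ne 0, \\[2mm] q_a(b), & \text{if } \nu\Lambda 1_{h^{-1}(a)} = 0. \end{cases}$$

To check that $q(\nu)$ is a probability measure we first note that for $\nu \in \Delta_a$ and $b \ne a$ we have

$$\nu\Lambda 1_{h^{-1}(b)} = \sum_{i \in h^{-1}(a)} \sum_{j \in h^{-1}(b)} \nu_i \lambda_{ij} \ge 0,$$

since the sums are extended to distinct indices $i, j$ and therefore $\lambda_{ij} \ge 0$. Moreover for $\nu \in \Delta_a$

$$\nu\Lambda 1_{h^{-1}(a)} + \sum_{b \in O,\, b \ne a} \nu\Lambda 1_{h^{-1}(b)} = \sum_{b \in O} \nu\Lambda 1_{h^{-1}(b)} = \nu\Lambda 1 = 0,$$

since $\Lambda 1 = 0$, so that $-\nu\Lambda 1_{h^{-1}(a)} \ge 0$ and $q(\nu, \cdot)$ is in fact a probability measure. Note that if $\nu \in \Delta_a$ and $\nu\Lambda 1_{h^{-1}(a)} = 0$ then also $\nu\Lambda 1_{h^{-1}(b)} = 0$ for all $b \ne a$ and consequently the equality

$$-\nu\Lambda 1_{h^{-1}(a)}\, q(\nu, b) = \nu\Lambda 1_{h^{-1}(b)} \tag{4.2}$$

holds for all $\nu \in \Delta_a$ and $b \ne a$. Finally note that if $\nu \in \Delta_a$ then $q(\nu, a) = 0$.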
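The definition of $q(\nu, \cdot)$ can be illustrated by the following minimal sketch. It assumes that $\nu$ is given as a list supported on $h^{-1}(a)$ and that the arbitrary probability $q_a$ is passed in as a dictionary; the name `jump_law` and this calling convention are hypothetical.

```python
def jump_law(nu, Lam, h, a, fallback):
    """q(nu, .) for nu supported on h^{-1}(a); `fallback` plays the role of q_a."""
    n = len(nu)
    nuL = [sum(nu[i] * Lam[i][j] for i in range(n)) for j in range(n)]
    labels = sorted(set(h))
    # flux[b] = nu Lam 1_{h^{-1}(b)}: nonnegative for b != a, nonpositive for b == a
    flux = {b: sum(nuL[j] for j in range(n) if h[j] == b) for b in labels}
    if flux[a] == 0.0:
        return dict(fallback)          # degenerate case: use the arbitrary law q_a
    return {b: (0.0 if b == a else -flux[b] / flux[a]) for b in labels}
```

For instance, with three states observed separately and rates 1 and 2 out of the first state, the returned law gives weights 1/3 and 2/3 to the two possible new observations.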
Let $(\Pi^\mu_t)$ denote the filtering process, solution of (3.1).

Theorem 4.1 Let $\mu \in \Delta$. Then for $t \ge 0$, $b \in O$, $n \ge 1$ we have

$$P_\mu(S_n > t, T_{n-1} < \infty \mid \Sigma_{n-1}) = \exp\Big( \int_0^t \phi(s, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\, ds \Big)\, 1_{T_{n-1} < \infty}, \tag{4.3}$$

$$P_\mu(Y_{T_n} = b, T_n < \infty \mid \Sigma^+_{n-1}) = q(\Pi^\mu_{T_n-}, b)\, 1_{T_n < \infty} = q(\phi(S_n, \Pi^\mu_{T_{n-1}}), b)\, 1_{T_n < \infty}. \tag{4.4}$$

Remark 4.2 Recall that $\Sigma_{n-1}$ and $\Sigma^+_{n-1}$ were defined in (4.1). The equalities (4.3)-(4.4) may be written

$$P_\mu(S_n > t \mid \Sigma_{n-1}) = \exp\Big( \int_0^t \phi(s, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\, ds \Big) \quad \text{on } \{T_{n-1} < \infty\},$$

$$P_\mu(Y_{T_n} = b \mid \Sigma^+_{n-1}) = q(\Pi^\mu_{T_n-}, b) = q(\phi(S_n, \Pi^\mu_{T_{n-1}}), b) \quad \text{on } \{T_n < \infty\}.$$

The intuitive meaning of (4.3)-(4.4) is as follows: suppose we know $Y_0, T_1, Y_{T_1}, \dots, T_{n-1}, Y_{T_{n-1}}$, or equivalently we know the trajectory of $Y$ up to the jump time $T_{n-1}$. This also determines the trajectory of $\Pi^\mu$ up to $T_{n-1}$. Suppose that $Y_{T_{n-1}} = a$, $\Pi^\mu_{T_{n-1}} = \nu \in \Delta_a$. Then the probability that the sojourn time will exceed $t$ is $\exp\big( \int_0^t \phi(s, \nu)\Lambda 1_{h^{-1}(a)}\, ds \big)$. If the following jump occurs at time $T_n < \infty$ then $Y$ jumps to $b$ with probability $q(\Pi^\mu_{T_n-}, b)$, which depends on the position $\Pi^\mu_{T_n-} = \phi(T_n - T_{n-1}, \nu)$ of the process $\Pi^\mu$ immediately before the jump. Thus, conditionally on the "past" $(Y_t)_{0 \le t \le T_{n-1}}$, the distribution of the "future" $(Y_t)_{t \ge T_{n-1}}$ is determined by the present value of the filtering process $\Pi^\mu_{T_{n-1}}$.

As a preparation for the proof of theorem 4.1 we need the following lemma.

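Lemma 4.3 Let $J^b_t$ denote the number of jumps of $Y$ to $b \in O$ in the time interval $(0, t]$:

$$J^b_t = \sum_{k=1}^{\infty} 1_{T_k \le t}\, 1_{Y_{T_k} = b}, \quad t \ge 0.$$

Then the process

$$M_t := J^b_t - \int_0^t \Pi^\mu_s \Lambda 1_{h^{-1}(b)}\, 1_{Y_s \ne b}\, ds, \quad t \ge 0,$$

is a martingale with respect to $(\mathcal{Y}^0_t)$. Moreover

$$E_\mu[M_{t \wedge T_n} - M_{t \wedge T_{n-1}} \mid \Sigma_{n-1}] = 0, \quad t \ge 0,\ n \ge 1. \tag{4.5}$$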
Proof. We start by recalling the well known fact that for every $f : I \to \mathbb{R}$ the process

$$\Pi^\mu_t f - \Pi^\mu_0 f - \int_0^t \Pi^\mu_s \Lambda f\, ds, \quad t \ge 0,$$

is a $(\mathcal{Y}^0_t)$-martingale (see e.g. [18] formula VI-(8.17)). Choosing $f = 1_b \circ h = 1_{h^{-1}(b)}$ we have $\Pi^\mu_t f = E_\mu[1_b(Y_t) \mid \mathcal{Y}^0_t] = 1_b(Y_t)$ and we deduce that

$$m_t := 1_b(Y_t) - 1_b(Y_0) - \int_0^t \Pi^\mu_s \Lambda 1_{h^{-1}(b)}\, ds, \quad t \ge 0,$$

is a martingale. Next we set

$$M_t := \int_{(0,t]} 1_{O \setminus \{b\}}(Y_{s-})\, dm_s = \int_{(0,t]} 1_{Y_{s-} \ne b}\, dm_s, \quad t \ge 0.$$

$M_t$ is defined as a pathwise Stieltjes integral and it is a martingale, since the integrand process is $(\mathcal{Y}^0_t)$-predictable and bounded. Finally we note that

$$\begin{aligned}
M_t &= \int_{(0,t]} 1_{Y_{s-} \ne b}\, d1_b(Y_s) - \int_{(0,t]} 1_{Y_{s-} \ne b}\, \Pi^\mu_s \Lambda 1_{h^{-1}(b)}\, ds \\
&= J^b_t - \int_0^t 1_{Y_s \ne b}\, \Pi^\mu_s \Lambda 1_{h^{-1}(b)}\, ds.
\end{aligned}$$

The last equality holds since $d1_b(Y_s)$ is a measure equal to $1$ at each point where $Y$ jumps to $b$, and equal to $-1$ at each point where $Y$ leaves $b$, and since we have $Y_{s-} = Y_s$ almost everywhere with respect to the Lebesgue measure $ds$.

For fixed $t$, the stopped process $(M_{s \wedge t})_{s \ge 0}$ is a uniformly integrable $(\mathcal{Y}^0_t)$-martingale and by optional stopping $E_\mu[M_{t \wedge T_n} \mid \mathcal{Y}^0_{T_{n-1}}] = M_{t \wedge T_{n-1}}$. Since $\Sigma_{n-1} \subset \mathcal{Y}^0_{T_{n-1}}$ this proves the last assertion of the lemma. □
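Proof of theorem 4.1. We continue the notation of the previous lemma and we consider, for $t \ge 0$ and $n \ge 1$,

$$M_{t \wedge T_n} - M_{t \wedge T_{n-1}} = J^b_{t \wedge T_n} - J^b_{t \wedge T_{n-1}} - \int_{t \wedge T_{n-1}}^{t \wedge T_n} \Pi^\mu_s \Lambda 1_{h^{-1}(b)}\, 1_{Y_s \ne b}\, ds.$$

Since $J^b_{t \wedge T_n} = \sum_{k=1}^{n} 1_{T_k \le t}\, 1_{Y_{T_k} = b}$ we have $J^b_{t \wedge T_n} - J^b_{t \wedge T_{n-1}} = 1_{T_n \le t}\, 1_{Y_{T_n} = b}$. Next

$$\begin{aligned}
\int_{t \wedge T_{n-1}}^{t \wedge T_n} \Pi^\mu_s \Lambda 1_{h^{-1}(b)}\, 1_{Y_s \ne b}\, ds &= \int_0^t 1_{T_{n-1} < s < T_n}\, \Pi^\mu_s \Lambda 1_{h^{-1}(b)}\, 1_{Y_s \ne b}\, ds \\
&= \int_0^t 1_{T_{n-1} < s < T_n}\, \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(b)}\, 1_{Y_{T_{n-1}} \ne b}\, ds,
\end{aligned}$$

since for $T_{n-1} \le s < T_n$ we have $\Pi^\mu_s = \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})$ and $Y_s = Y_{T_{n-1}}$. Then from (4.5) it follows that

$$P_\mu(T_n \le t, Y_{T_n} = b \mid \Sigma_{n-1}) = E_\mu\Big[ \int_0^t 1_{T_{n-1} < s < T_n}\, \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(b)}\, 1_{Y_{T_{n-1}} \ne b}\, ds \,\Big|\, \Sigma_{n-1} \Big].$$

By standard arguments, for every $t \ge 0$ we can choose a version of the conditional probability $P_\mu(T_n > t \mid \Sigma_{n-1})$ in such a way that the function $t \to P_\mu(T_n > t \mid \Sigma_{n-1})$ is nonincreasing and right-continuous $P_\mu$-a.s. In particular, $P_\mu(T_n > t \mid \Sigma_{n-1})(\omega)$ is jointly measurable in $(\omega, t)$. An application of the Fubini theorem shows that

$$P_\mu(T_n \le t, Y_{T_n} = b \mid \Sigma_{n-1}) = \int_0^t P_\mu(T_n > s \mid \Sigma_{n-1})\, 1_{T_{n-1} < s}\, \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(b)}\, 1_{Y_{T_{n-1}} \ne b}\, ds. \tag{4.6}$$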

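Now we sum over all $b \in O$. Denote $a = Y_{T_{n-1}}$ and $\nu = \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})$ for short. Then $\Pi^\mu_{T_{n-1}} \in \Delta_a$ and consequently $\nu \in \Delta_a$ since the flow leaves $\Delta_a$ invariant. The identity

$$\nu\Lambda 1_{h^{-1}(a)} + \sum_{b \in O,\, b \ne a} \nu\Lambda 1_{h^{-1}(b)} = 0,$$

already noticed before, shows that

$$\sum_{b \in O} \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(b)}\, 1_{Y_{T_{n-1}} \ne b} = -\phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}$$

and we arrive at

$$P_\mu(T_n \le t \mid \Sigma_{n-1}) = -\int_0^t P_\mu(T_n > s \mid \Sigma_{n-1})\, 1_{T_{n-1} < s}\, \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\, ds. \tag{4.7}$$

It follows that, $P_\mu$-a.s., the function $t \to P_\mu(T_n > t \mid \Sigma_{n-1})$ is absolutely continuous with derivative

$$\frac{d}{dt} P_\mu(T_n > t \mid \Sigma_{n-1}) = P_\mu(T_n > t \mid \Sigma_{n-1})\, 1_{T_{n-1} < t}\, \phi(t - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}.$$

Together with the condition $P_\mu(T_n > 0 \mid \Sigma_{n-1}) = 1$ this implies

$$P_\mu(T_n > t \mid \Sigma_{n-1}) = \exp\Big( \int_0^t 1_{T_{n-1} < s}\, \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\, ds \Big). \tag{4.8}$$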



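This equality will be used in three ways. First, substituting in (4.6), we obtain a formula for the joint distribution of $T_n$ and $Y_{T_n}$ conditional on $\Sigma_{n-1}$:

$$\begin{aligned}
P_\mu(T_n \le t, Y_{T_n} = b \mid \Sigma_{n-1}) = \int_0^t &\exp\Big( \int_0^s 1_{T_{n-1} < r}\, \phi(r - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\, dr \Big) \\
&\cdot 1_{T_{n-1} < s}\, \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(b)}\, 1_{Y_{T_{n-1}} \ne b}\, ds.
\end{aligned} \tag{4.9}$$

Second, (4.8) shows that the random variable $T_n$ (which may take the value $\infty$) has the property that its conditional distribution with respect to $\Sigma_{n-1}$, restricted to $[0, \infty)$, possesses a density $d_n$ with respect to the Lebesgue measure, and $d_n$ can be computed by differentiating the right-hand side of (4.8):

$$d_n(t) = -\exp\Big( \int_0^t 1_{T_{n-1} < s}\, \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\, ds \Big) \cdot 1_{T_{n-1} < t}\, \phi(t - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}. \tag{4.10}$$

Third, (4.3) can be easily deduced from (4.8) as follows: since

$$P_\mu(S_n > t, T_{n-1} < \infty \mid \Sigma_{n-1}) = P_\mu(T_n > t + T_{n-1}, T_{n-1} < \infty \mid \Sigma_{n-1})$$

and since $T_{n-1}$ is $\Sigma_{n-1}$-measurable we deduce from (4.8) that

$$\begin{aligned}
P_\mu(S_n > t, T_{n-1} < \infty \mid \Sigma_{n-1}) &= 1_{T_{n-1} < \infty} \exp\Big( \int_0^{t + T_{n-1}} 1_{T_{n-1} < s}\, \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\, ds \Big) \\
&= 1_{T_{n-1} < \infty} \exp\Big( \int_0^t \phi(s, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\, ds \Big).
\end{aligned}$$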

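It remains to prove (4.4). To this end we compute

$$E_\mu[q(\Pi^\mu_{T_n-}, b)\, 1_{T_n \le t} \mid \Sigma_{n-1}] = E_\mu[q(\phi(T_n - T_{n-1}, \Pi^\mu_{T_{n-1}}), b)\, 1_{T_n \le t} \mid \Sigma_{n-1}].$$

Noting that $T_{n-1}$ and $\Pi^\mu_{T_{n-1}}$ are $\Sigma_{n-1}$-measurable and recalling the expression of the conditional density (4.10) we obtain

$$\begin{aligned}
E_\mu[q(\Pi^\mu_{T_n-}, b)\, 1_{T_n \le t} \mid \Sigma_{n-1}] &= \int_0^t q(\phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}}), b)\, d_n(s)\, ds \\
&= -\int_0^t q(\phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}}), b)\, \phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})} \\
&\qquad \cdot 1_{T_{n-1} < s} \exp\Big( \int_0^s 1_{T_{n-1} < r}\, \phi(r - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\, dr \Big)\, ds.
\end{aligned}$$

For a moment denote $Y_{T_{n-1}}$ by $a$ and $\phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})$ by $\nu$. Then $\Pi^\mu_{T_{n-1}} \in \Delta_a$, which implies $\nu \in \Delta_a$ by the invariance of $\Delta_a$ under the flow $\phi$. If $a = b$ then $q(\nu, b) = 0$. If $a \ne b$ then $q(\nu, b)\, \nu\Lambda 1_{h^{-1}(a)} = -\nu\Lambda 1_{h^{-1}(b)}$ as noticed in (4.2). So we obtain

$$\begin{aligned}
E_\mu[q(\Pi^\mu_{T_n-}, b)\, 1_{T_n \le t} \mid \Sigma_{n-1}] = 1_{Y_{T_{n-1}} \ne b} \int_0^t &\phi(s - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(b)} \\
&\cdot 1_{T_{n-1} < s} \exp\Big( \int_0^s 1_{T_{n-1} < r}\, \phi(r - T_{n-1}, \Pi^\mu_{T_{n-1}})\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\, dr \Big)\, ds.
\end{aligned}$$

Comparing with (4.9) we conclude that

$$P_\mu(T_n \le t, Y_{T_n} = b \mid \Sigma_{n-1}) = E_\mu[q(\Pi^\mu_{T_n-}, b)\, 1_{T_n \le t} \mid \Sigma_{n-1}].$$

Since $\Sigma^+_{n-1}$ is generated by $\Sigma_{n-1}$ and by $T_n$, this immediately implies (4.4). □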

4.1 An application to exit time distributions 

As an example of the kind of results that can follow from theorem 4.1 we prove an explicit formula for the law of the exit time of a finite Markov chain from a given set.
We start from formula (4.3) written for $n = 1$:

$$P_\mu(S_1 > t \mid \sigma(Y_0)) = \exp\Big( \int_0^t \phi(s, \Pi^\mu_0)\Lambda 1_{h^{-1}(Y_0)}\, ds \Big), \quad t \ge 0.$$

Assume in addition that $\mu = \delta_i$ is the measure concentrated at some $i \in I$ and let $a = h(i) \in O$. We denote $P_\mu$ by $P_i$ and we note that $Y_0 = a$ and $\Pi^\mu_0 = \delta_i$, so we obtain

$$P_i(S_1 > t) = \exp\Big( \int_0^t \phi(s, \delta_i)\Lambda 1_{h^{-1}(a)}\, ds \Big), \quad t \ge 0.$$

Finally assume that $O = \{a, b\}$ consists of exactly two points, and denote $A = h^{-1}(a)$. Then $S_1$ coincides with the first exit time from $A$:

$$\tau = \inf\{t \ge 0 : X_t \notin A\}. \tag{4.11}$$

Note that $y(t) := \phi(t, \delta_i)$ is a solution of the differential equation

$$y'(t) = 1_{h^{-1}(a)} * (y(t)\Lambda) - (y(t)\Lambda 1_{h^{-1}(a)})\, y(t), \quad t \ge 0,$$

with initial condition $y(0) = \delta_i$. By proposition 2.1 there exists a unique global solution with values in $\Delta_a$. So the only possibly nonzero components are $y(t,j)$, for $j \in A$, and the equation can be written in scalar form as

$$\frac{d}{dt} y(t,j) = \sum_{k \in A} y(t,k)\lambda_{kj} - \Big( \sum_{k,h \in A} y(t,k)\lambda_{kh} \Big)\, y(t,j), \quad j \in A,\ t \ge 0, \tag{4.12}$$

with initial conditions

$$y(0,i) = 1, \qquad y(0,j) = 0, \quad j \in A,\ j \ne i. \tag{4.13}$$

We have finally proved the following:

Proposition 4.4 Suppose $X$ is a time-homogeneous Markov chain in a finite set $I$ with rate transition matrix $\Lambda = (\lambda_{nm})_{n,m \in I}$. Let $A$ be a proper subset of $I$, let $P_i$ denote the law of the chain starting at $i \in A$ and let $\tau$ be the first exit time from $A$ as defined in (4.11). Then we have

$$P_i(\tau > t) = \exp\Big( \int_0^t \sum_{k,h \in A} y(s,k)\lambda_{kh}\, ds \Big), \quad t \ge 0,$$

where $y(t,j)$, ($j \in A$, $t \ge 0$) is the unique solution of (4.12)-(4.13).
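The formula of the proposition lends itself to a simple numerical check. The sketch below is illustrative only: it integrates (4.12)-(4.13) with a classical fourth-order Runge-Kutta scheme (an arbitrary choice of integrator) and accumulates the integral appearing in the exponent; the function name `exit_survival` and its argument conventions are hypothetical.

```python
import math


def exit_survival(Lam, A, i, t, steps=2000):
    """Approximate P_i(tau > t) from (4.12)-(4.13) using classical RK4."""
    LA = [[Lam[k][m] for m in A] for k in A]       # rates restricted to A
    n = len(A)

    def field(y):
        yL = [sum(y[k] * LA[k][j] for k in range(n)) for j in range(n)]
        rate = sum(yL)                              # sum_{k,h in A} y(k) lambda_{kh}
        return [yL[j] - rate * y[j] for j in range(n)], rate

    y = [1.0 if state == i else 0.0 for state in A]    # initial condition (4.13)
    dt = t / steps
    log_survival = 0.0
    for _ in range(steps):
        k1, r1 = field(y)
        k2, r2 = field([y[j] + 0.5 * dt * k1[j] for j in range(n)])
        k3, r3 = field([y[j] + 0.5 * dt * k2[j] for j in range(n)])
        k4, r4 = field([y[j] + dt * k3[j] for j in range(n)])
        y = [y[j] + dt * (k1[j] + 2 * k2[j] + 2 * k3[j] + k4[j]) / 6
             for j in range(n)]
        log_survival += dt * (r1 + 2 * r2 + 2 * r3 + r4) / 6
    return math.exp(log_survival)
```

When every state of $A$ has the same total exit rate $c$, the sum in the exponent is constantly $-c$ and the formula returns $e^{-ct}$, which is a convenient sanity check.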



Remark 4.5 Proposition 4.4 is a statement on the law of the exit time of a finite Markov chain from a given set. Since it is not directly related to filtering theory, but rather concerns a basic topic in the theory of Markov chains, it can possibly be proved by different arguments. However we were not able to find a reference providing such an explicit formula.

Remark 4.6 Proposition 4.4 follows from an application of formula (4.3). Similar arguments based on (4.4) provide the distribution of the Markov chain at the exit time from the set $A$. We omit the details.



5 Filtering processes and piecewise-deterministic Markov processes

The main purpose of this section is to show that the filtering process is a piecewise-deterministic Markov process (PDP) in the sense of Davis [9], [10], and to present some consequences. To this end we first recall the definition of this class of processes.

5.1 Piecewise-deterministic Markov processes (PDPs) 

We limit ourselves to the special case when the state space of the PDP is the effective simplex $\Delta_e$, since this is the only case we will deal with. Other differences from the general framework considered in [10] are pointed out in remark 5.1 below. We recall that $\Delta_e$ is a disjoint union $\cup_{a \in O} \Delta_a$ where each $\Delta_a$ is a compact subset of the euclidean space. We assume that we are also given the following objects.

1. A flow $\phi$ on $\Delta_e$. By this we mean a continuous function $\phi : \mathbb{R}_+ \times \Delta_e \to \Delta_e$ such that $\phi(t, \phi(s, x)) = \phi(t+s, x)$ for $t, s \ge 0$ and $x \in \Delta_e$ and leaving each set $\Delta_a$ invariant, i.e. $\phi(t, x) \in \Delta_a$ if $x \in \Delta_a$.

2. A jump rate function $\lambda : \Delta_e \to \mathbb{R}_+$. We require that it is measurable and that for every $x \in \Delta_e$ there exists $\epsilon > 0$ (depending on $x$) such that $\int_0^\epsilon \lambda(\phi(s, x))\, ds < \infty$.

3. A transition measure $Q$ on $(\Delta_e, \mathcal{B}(\Delta_e))$, i.e. a stochastic kernel $Q(x, A)$ defined for $x \in \Delta_e$, $A \in \mathcal{B}(\Delta_e)$. We require that $Q(x, \{x\}) = 0$ for every $x \in \Delta_e$.

A process $(Z_t)_{t \ge 0}$, defined on some probability space $(\Omega', \mathcal{F}', P')$, is called a PDP with respect to $(\phi, \lambda, Q)$, starting at $z_0 \in \Delta_e$, if there exists a sequence of nondecreasing random variables $T_n : \Omega' \to [0, \infty]$ ($n \ge 1$), such that the following holds.

(i) $Z_0 = z_0$ and the trajectories of $(Z_t)$ are càdlàg functions, with discontinuities occurring precisely at times $(T_n)$, $P'$-a.s.

(ii) Between jump times the process evolves deterministically along the flow; more precisely we have, for $n \ge 1$, $P'$-a.s.,

$$Z_t = \phi(t - T_{n-1}, Z_{T_{n-1}}) \quad \text{for } T_{n-1} \le t < T_n, \tag{5.1}$$

where we set $T_0 = 0$. This implies that the random variables $Z_{T_n-} := \lim_{t \to T_n,\, t < T_n} Z_t$, defined on $\{T_n < \infty\}$, are also given by the formula

$$Z_{T_n-} = \phi(T_n - T_{n-1}, Z_{T_{n-1}}) = \phi(S_n, Z_{T_{n-1}}), \quad \text{on } \{T_n < \infty\},\ n \ge 1,$$

where $S_n = T_n - T_{n-1}$ ($S_n$ are defined for $n \ge 1$ on the event $\{T_{n-1} < \infty\}$).

(iii) For $t \ge 0$ and $A \in \mathcal{B}(\Delta_e)$,

$$P'(T_1 > t) = \exp\Big( -\int_0^t \lambda(\phi(s, z_0))\, ds \Big),$$

$$P'(Z_{T_1} \in A \mid T_1) = Q(Z_{T_1-}, A) = Q(\phi(T_1, z_0), A), \quad \text{on } \{T_1 < \infty\}, \tag{5.2}$$

and for $n \ge 2$,

$$P'(S_n > t \mid T_1, Z_{T_1}, \dots, T_{n-1}, Z_{T_{n-1}}) = \exp\Big( -\int_0^t \lambda(\phi(s, Z_{T_{n-1}}))\, ds \Big) \quad \text{on } \{T_{n-1} < \infty\},$$

$$P'(Z_{T_n} \in A \mid T_1, Z_{T_1}, \dots, T_{n-1}, Z_{T_{n-1}}, T_n) = Q(Z_{T_n-}, A) = Q(\phi(S_n, Z_{T_{n-1}}), A) \quad \text{on } \{T_n < \infty\}. \tag{5.3}$$

Formulae (5.2)-(5.3) allow us to interpret $Q(x, A)$ as the probability of finding the process in the set $A \subset \Delta_e$ at a jump time, conditional on the fact that the process was in $x \in \Delta_e$ immediately before the jump. They also explain the terminology jump rate for the function $\lambda$.

Formulae (5.2)-(5.3) also show that $(\phi, \lambda, Q)$ and the starting point $z_0$ uniquely determine the finite-dimensional distributions of the stochastic process $\{T_1, Z_{T_1}, T_2, Z_{T_2}, \dots\}$. In view of (5.1) we conclude that the law of $(Z_t)$ is completely determined by $(\phi, \lambda, Q)$ and $z_0$.
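Conditions (i)-(iii) also suggest how a PDP with given characteristics could be simulated. The following is a rough sketch under stated assumptions, not a construction from the source: the flow, the jump rate and a sampler for the kernel $Q$ are passed in as plain Python callables (hypothetical interface), and each sojourn time is obtained by accumulating the integrated rate on a grid of mesh `dt` until it exceeds an independent standard exponential level, which reproduces the survival function in (5.2)-(5.3) up to discretisation error.

```python
import math
import random


def simulate_pdp(z0, flow, rate, jump, horizon, dt=1e-3, rng=None):
    """Return [(0, z0), (T_1, Z_{T_1}), ...] for jumps up to about `horizon`."""
    rng = rng or random.Random(0)
    t, z = 0.0, z0
    path = [(t, z)]
    while t < horizon:
        level = -math.log(1.0 - rng.random())     # standard exponential level
        s, acc = 0.0, 0.0
        while acc < level and t + s < horizon:    # integrate the rate along the flow
            acc += rate(flow(s, z)) * dt
            s += dt
        if acc < level:                           # no further jump before the horizon
            break
        z = jump(flow(s, z), rng)                 # draw Z_{T_n} from Q(Z_{T_n-}, .)
        t += s
        path.append((t, z))
    return path
```

With a vanishing rate the path consists of the starting point only; with a constant rate the jump times form, approximately, a Poisson stream.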

Remark 5.1 The present definition differs from the one given in [10] for the following reasons.

(i) In [10] a specific probability space is chosen, namely the one consisting of a countable product of unit intervals equipped with the product Lebesgue measure. However the law of the constructed process is the same, and this is what matters in the following.

(ii) In [10] the state space $\Delta_e = \cup_{a \in O} \Delta_a$ is replaced by a general, finite or countable, union $E = \cup_\nu E_\nu$ of open sets $E_\nu$ of the euclidean space. The fact that the sets $\Delta_a$ are compact does not affect the basic results. Similar slight differences are sometimes present in the literature: for instance in [12] the state space $E$ is assumed to be closed; also compare the discussion in [10], at the beginning of section 24, on possible generalizations to cases where each $E_\nu$ may be a differentiable manifold.

(iii) In [10], instead of the flow, a vector field is chosen as a starting point. This field is assumed to be locally Lipschitz and to generate a flow which is defined up to the time when it hits the boundary of $E_\nu$, for every starting point in $E_\nu$. In the specific situation of the filtering process which we are about to study we will also exhibit the vector field associated to the flow.




(iv) The main difference is the fact that in [10] the trajectories of the PDP $(Z_t)$ are required to jump at each time when they hit the boundary of some $E_\nu$. Jumps at the boundary will not occur for the filtering process presented later. This difference does not affect the basic results we are going to use, and in fact it results in a simplification: for instance the delicate "boundary conditions" presented in [10] in connection with a description of the extended infinitesimal generator of the PDP are not needed. Again, similar differences are already present in the literature: for instance in [12] a jump occurs whenever the process hits some prescribed closed set $\Gamma \subset E$, not necessarily the boundary of $E$.

5.2 The filtering process as a PDP 

Now we come back to the canonical set-up introduced in subsection 3.1: we fix $\nu \in \Delta_e$ and we construct the probability space $(\Omega, \mathcal{F}^0, P_\nu)$. In the rest of this section all stochastic processes will be considered under $P_\nu$, so that in particular $\nu$ is the initial distribution of $(X_t)$. Since $\nu$ belongs to the effective simplex, as noted in remark 3.2, the filtering process $(\Pi^\nu_t)$ starts at $\nu$, i.e. $\Pi^\nu_0 = \nu$. It is our purpose to show that $(\Pi^\nu_t)$ is a PDP and to describe explicitly the corresponding triple $(\phi, \lambda, Q)$.

1. As the flow $\phi$ we take the flow introduced in subsection 2.2 after proposition 2.1. We recall that it is the flow associated to the vector field $F$ defined on $\Delta_e = \cup_{a \in O} \Delta_a$ by the formula

$$F(\nu) = 1_{h^{-1}(a)} * (\nu\Lambda) - (\nu\Lambda 1_{h^{-1}(a)})\,\nu, \qquad \nu \in \Delta_a.$$

2. As the jump rate function we take the function $\lambda : \Delta_e \to \mathbb{R}_+$ defined by

$$\lambda(\nu) = -\nu\Lambda 1_{h^{-1}(a)}, \qquad \nu \in \Delta_a. \qquad (5.4)$$

It was shown in an earlier section that $\lambda(\nu) \ge 0$.

3. The transition measure $Q(\nu, A)$ is defined for $\nu \in \Delta_e$ and $A \in \mathcal{B}(\Delta_e)$ by

$$Q(\nu, A) = \sum_{b \in O} 1_A(H_b[\nu\Lambda])\, q(\nu, b), \qquad (5.5)$$

where $q(\nu, b)$ was introduced in section 4 (we still exclude the trivial case where $h$ is constant). Thus, for $\nu \in \Delta_a$, $Q(\nu, \cdot)$ is the measure concentrated on the finite set

$$\{H_b[\nu\Lambda] : b \in O \setminus \{a\}\} \subset \Delta_e \setminus \Delta_a,$$

and each point $H_b[\nu\Lambda]$ has mass $q(\nu, b)$. More explicitly, for $\nu \in \Delta_a$,

$$Q(\nu, \{H_b[\nu\Lambda]\}) = \frac{\nu\Lambda 1_{h^{-1}(b)}}{-\nu\Lambda 1_{h^{-1}(a)}},$$

provided $\nu\Lambda 1_{h^{-1}(a)} \ne 0$.
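The three ingredients above can be evaluated directly for a small chain. The following Python sketch does so for a hypothetical three-state example (the matrix `LAM`, the map `H` and all function names are assumptions made for illustration, not data from this paper). It treats $\nu$ as a row vector, computes $\nu\Lambda$, and then reads off $F(\nu)$, $\lambda(\nu)$ and the atoms of $Q(\nu, \cdot)$ from the formulae in items 1 to 3; atoms are listed only for those $b$ with $\nu\Lambda 1_{h^{-1}(b)} > 0$, which in particular forces $\lambda(\nu) > 0$.

```python
# Hypothetical example: I = {0, 1, 2}, O = {'a', 'b'}, h(0) = h(1) = 'a', h(2) = 'b'.
LAM = [[-2.0, 1.0, 1.0],
       [1.0, -3.0, 2.0],
       [1.0, 1.0, -2.0]]   # rate matrix Lambda: off-diagonal >= 0, rows sum to zero
H = ['a', 'a', 'b']        # observation map h

def nu_lam(nu):
    """Row vector nu * Lambda."""
    n = len(nu)
    return [sum(nu[i] * LAM[i][j] for i in range(n)) for j in range(n)]

def mass(vec, b):
    """vec 1_{h^{-1}(b)}: sum of the entries of vec over the fibre h^{-1}(b)."""
    return sum(x for x, lab in zip(vec, H) if lab == b)

def fibre(nu):
    """The label a such that nu belongs to Delta_a."""
    labels = {lab for x, lab in zip(nu, H) if x > 0}
    if len(labels) != 1:
        raise ValueError("nu is not in the effective simplex")
    return labels.pop()

def vector_field(nu):
    """F(nu) = 1_{h^{-1}(a)} * (nu Lambda) - (nu Lambda 1_{h^{-1}(a)}) nu."""
    a, nl = fibre(nu), nu_lam(nu)
    c = mass(nl, a)
    return [(x if lab == a else 0.0) - c * p for x, lab, p in zip(nl, H, nu)]

def jump_rate(nu):
    """lambda(nu) = -nu Lambda 1_{h^{-1}(a)}, formula (5.4)."""
    return -mass(nu_lam(nu), fibre(nu))

def transition_measure(nu):
    """Atoms of Q(nu, .): pairs (H_b[nu Lambda], q(nu, b)) for b != a with positive mass."""
    a, nl = fibre(nu), nu_lam(nu)
    lam = -mass(nl, a)
    atoms = []
    for b in sorted(set(H) - {a}):
        m = mass(nl, b)
        if m > 0:
            point = [x / m if lab == b else 0.0 for x, lab in zip(nl, H)]
            atoms.append((point, m / lam))
    return atoms

nu = [0.5, 0.5, 0.0]   # a point of Delta_a
```

For this `nu` one has $\nu\Lambda = (-0.5, -1, 1.5)$, hence $\lambda(\nu) = 1.5$, $F(\nu) = (0.25, -0.25, 0)$, and $Q(\nu, \cdot)$ is the unit mass at $(0, 0, 1)$ because there is only one other observation value.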

Remark 5.2 In the literature on PDPs the requirement that $Q$ is a Feller kernel is often formulated among the assumptions: this means that for every bounded continuous $g : \Delta_e \to \mathbb{R}$, the function $\nu \mapsto \int_{\Delta_e} g(\rho)\,Q(\nu, d\rho)$ is continuous (and necessarily bounded). This condition fails in general in our case, due to the occurrence of the exceptional set where $\nu\Lambda 1_{h^{-1}(a)} = 0$ ($\nu \in \Delta_a$) in the formulae above. However, in our case we have the following weaker form of the Feller property of $Q$.

Proposition 5.3 The function $\nu \mapsto \lambda(\nu) \int_{\Delta_e} g(\rho)\,Q(\nu, d\rho)$ is continuous (and obviously bounded) on $\Delta_e$ for every bounded continuous function $g : \Delta_e \to \mathbb{R}$.

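Proof. We start from formula (4.2): by the previous definitions it can be written, for $\nu \in \Delta_a$ and $b \ne a$:

$$\lambda(\nu)\, q(\nu, b) = \nu\Lambda 1_{h^{-1}(b)}.$$

Therefore, since $q(\nu, a) = 0$ for $\nu \in \Delta_a$, we obtain

$$\lambda(\nu)\, Q(\nu, A) = \sum_{b \in O,\, b \ne a} 1_A(H_b[\nu\Lambda])\; \nu\Lambda 1_{h^{-1}(b)}, \qquad \nu \in \Delta_a,\ A \in \mathcal{B}(\Delta_e),$$

and it follows that, for $\nu \in \Delta_a$,

$$\lambda(\nu) \int_{\Delta_e} g(\rho)\, Q(\nu, d\rho) = \sum_{b \in O,\, b \ne a} g(H_b[\nu\Lambda])\; \nu\Lambda 1_{h^{-1}(b)}.$$

Now if $\nu_n \to \nu$ then for all large $n$ we have $\nu_n \in \Delta_a$ and we distinguish two cases.

If $\nu\Lambda 1_{h^{-1}(a)} \ne 0$ then for $b \ne a$

$$H_b[\nu_n\Lambda] = \frac{1}{\nu_n\Lambda 1_{h^{-1}(b)}}\, 1_{h^{-1}(b)} * (\nu_n\Lambda) \to \frac{1}{\nu\Lambda 1_{h^{-1}(b)}}\, 1_{h^{-1}(b)} * (\nu\Lambda) = H_b[\nu\Lambda],$$

and the result follows from the continuity of $g$.

If $\nu\Lambda 1_{h^{-1}(a)} = -\lambda(\nu) = 0$ then for all $b \ne a$ we have $\nu_n\Lambda 1_{h^{-1}(b)} \to \nu\Lambda 1_{h^{-1}(b)} = 0$, which implies $g(H_b[\nu_n\Lambda])\, \nu_n\Lambda 1_{h^{-1}(b)} \to 0$ by the boundedness of $g$. □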

We are now ready to present the main result of this section. 

Theorem 5.4 For every $\nu \in \Delta_e$ the filtering process $(\Pi^\nu_t)$, defined in the probability space $(\Omega, \mathcal{F}^0, P_\nu)$ and taking values in $\Delta_e$, is a piecewise-deterministic Markov process with respect to the triple $(\phi, \lambda, Q)$ defined above and with starting point $\nu$.

Proof. Let $(T_n)_{n \ge 1}$ be the sequence of jump times of $Y$, with the convention that $T_n = \infty$ for all $n$ if no jump occurs and $T_n < T_{n+1} = T_{n+2} = \ldots = \infty$ if precisely $n$ jumps occur. We let $T_0 = 0$ and define $S_n = T_n - T_{n-1}$ for $n \ge 1$ on the event $\{T_{n-1} < \infty\}$.

According to equation (3.1), $(\Pi^\nu_t)$ can be explicitly described as follows: the starting point is $\Pi^\nu_0 = \nu$ and for $n \ge 1$,

$$\Pi^\nu_t = \phi(t - T_{n-1}, \Pi^\nu_{T_{n-1}}) \quad \text{for } T_{n-1} \le t < T_n,$$
$$\Pi^\nu_{T_n-} = \phi(S_n, \Pi^\nu_{T_{n-1}}), \qquad \Pi^\nu_{T_n} = H_{Y_{T_n}}[\Pi^\nu_{T_n-}\Lambda].$$

It is clear that $\Pi^\nu$ can jump only at times when $Y$ jumps. We now claim that each $T_n$ ($n \ge 1$) is also a jump time of $\Pi^\nu$ and therefore the jump times of $\Pi^\nu$ and $Y$ coincide. Indeed, if $a$ denotes $Y_{T_n-} = Y_{T_{n-1}}$ and $b$ denotes $Y_{T_n}$ then $a \ne b$ and since $\Pi^\nu_{T_n-} \in \Delta_a$, $\Pi^\nu_{T_n} \in \Delta_b$, it follows that $\Pi^\nu_{T_n-} \ne \Pi^\nu_{T_n}$.

In (4.1) we defined for $n \ge 1$

$$\mathcal{E}_{n-1} = \sigma(Y_0, T_1, Y_{T_1}, \ldots, T_{n-1}, Y_{T_{n-1}}), \qquad \mathcal{E}^+_{n-1} = \sigma(Y_0, T_1, Y_{T_1}, \ldots, T_{n-1}, Y_{T_{n-1}}, T_n).$$

Note that $Y_0$ is constant $P_\nu$-a.s. since if $\nu \in \Delta_a$ for some $a \in O$ then $Y_0 = a$. Also note that by the filtering equation (3.1) $T_1, Y_{T_1}, \ldots, T_{n-1}, Y_{T_{n-1}}$ uniquely determine $T_1, \Pi^\nu_{T_1}, \ldots, T_{n-1}, \Pi^\nu_{T_{n-1}}$. However, the converse is also true, since if $\Pi^\nu_{T_k} \in \Delta_a$ for some $a \in O$ at a jump time $T_k < \infty$ then $Y_{T_k} = a$. So we conclude that up to $P_\nu$-null sets, for $n \ge 1$,

$$\mathcal{E}_{n-1} = \sigma(T_1, \Pi^\nu_{T_1}, \ldots, T_{n-1}, \Pi^\nu_{T_{n-1}}), \qquad \mathcal{E}^+_{n-1} = \sigma(T_1, \Pi^\nu_{T_1}, \ldots, T_{n-1}, \Pi^\nu_{T_{n-1}}, T_n),$$

and in particular $\mathcal{E}_0$ is the trivial $\sigma$-algebra and $\mathcal{E}^+_0 = \sigma(T_1)$.
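Formula (4.3) in theorem 4.1 shows that for $t \ge 0$ and $n \ge 1$,

$$P_\nu(S_n > t,\ T_{n-1} < \infty \mid \mathcal{E}_{n-1}) = \exp\left(\int_0^t \phi(s, \Pi^\nu_{T_{n-1}})\,\Lambda 1_{h^{-1}(Y_{T_{n-1}})}\,ds\right) 1_{T_{n-1} < \infty}.$$

Denoting $Y_{T_{n-1}}$ by $a$, we have $\Pi^\nu_{T_{n-1}} \in \Delta_a$ and $\phi(s, \Pi^\nu_{T_{n-1}}) \in \Delta_a$ by the invariance of $\Delta_a$ under the flow. So by the definition of the jump rate function $\lambda$ (formula (5.4)) we have $\phi(s, \Pi^\nu_{T_{n-1}})\,\Lambda 1_{h^{-1}(Y_{T_{n-1}})} = -\lambda(\phi(s, \Pi^\nu_{T_{n-1}}))$ and we conclude that on the event $\{T_{n-1} < \infty\}$ we have

$$P_\nu(S_n > t \mid \mathcal{E}_{n-1}) = \exp\left(-\int_0^t \lambda(\phi(s, \Pi^\nu_{T_{n-1}}))\,ds\right). \qquad (5.6)$$

This formula shows that the sojourn times $S_n$ have the required conditional distributions.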

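Now we proceed to compute the conditional distributions of $\Pi^\nu_{T_n}$. We note that at each jump time $T_n < \infty$ ($n \ge 1$) we have $\Pi^\nu_{T_n} = H_{Y_{T_n}}[\Pi^\nu_{T_n-}\Lambda]$ and therefore

$$\{\Pi^\nu_{T_n} \in A,\ T_n < \infty\} = \bigcup_{b \in O} \{H_b[\Pi^\nu_{T_n-}\Lambda] \in A,\ Y_{T_n} = b,\ T_n < \infty\}.$$

Since $\Pi^\nu_{T_n-} = \phi(S_n, \Pi^\nu_{T_{n-1}})$ on $\{T_n < \infty\}$, the set $\{H_b[\Pi^\nu_{T_n-}\Lambda] \in A,\ T_n < \infty\}$ belongs to $\mathcal{E}^+_{n-1}$ and therefore

$$P_\nu(\Pi^\nu_{T_n} \in A,\ T_n < \infty \mid \mathcal{E}^+_{n-1}) = \sum_{b \in O} 1_{T_n < \infty}\, 1_A(H_b[\Pi^\nu_{T_n-}\Lambda])\; P_\nu(Y_{T_n} = b,\ T_n < \infty \mid \mathcal{E}^+_{n-1}).$$

The right-hand side can be computed using formula (4.4) in theorem 4.1 and we obtain

$$P_\nu(\Pi^\nu_{T_n} \in A,\ T_n < \infty \mid \mathcal{E}^+_{n-1}) = \sum_{b \in O} 1_{T_n < \infty}\, 1_A(H_b[\Pi^\nu_{T_n-}\Lambda])\; q(\Pi^\nu_{T_n-}, b) = 1_{T_n < \infty}\, Q(\Pi^\nu_{T_n-}, A),$$

where the last equality follows from the definition of the transition measure $Q$ (formula (5.5)).

Together with (5.6), this shows that the properties (5.2) and (5.3) hold true and therefore $(\Pi^\nu_t)$ is a PDP with the prescribed jump rate function $\lambda$ and transition measure $Q$. □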

At this point several properties of the filtering process might be stated as immediate consequences of general results on PDPs. For instance, an explicit description of the extended generator of its transition semigroup can be given in terms of the triple $(\phi, \lambda, Q)$: see [10]. In section 6 below we will use known results on standard optimal stopping problems for PDPs in order to solve an optimal stopping problem with partial observation for the process $X$. In the present section we will exploit general knowledge on PDPs in order to prove that the filtering process has the Feller property.



5.3 The Feller property of the filtering process 

We still use the canonical set-up of subsection 3.1 and for every $\nu \in \Delta_e$ we consider again the filtering process $(\Pi^\nu_t)$ defined in the probability space $(\Omega, \mathcal{F}^0, P_\nu)$ and taking values in $\Delta_e$. It is a piecewise-deterministic Markov process with respect to the triple $(\phi, \lambda, Q)$ defined above and with starting point $\nu$. Here we wish to investigate further properties of its transition semigroup $(R_t)$ introduced before proposition 3.4. We denote by $C(\Delta_e)$ (respectively, $B(\Delta_e)$) the space of real continuous (respectively, Borel measurable) functions on $\Delta_e$. Since $\Delta_e$ is compact, we have $C(\Delta_e) \subset B(\Delta_e)$. It is clear that $R_t$ maps $B(\Delta_e)$ into $B(\Delta_e)$ for all $t \ge 0$. The fact that $R_t$ also maps $C(\Delta_e)$ into $C(\Delta_e)$ is known as the Feller property and it is proved in the following proposition, together with a statement on continuity which shows that $(R_t)$ is a strongly continuous semigroup on the space $C(\Delta_e)$ equipped with the supremum norm.

Proposition 5.5 For every $f \in C(\Delta_e)$ we have

$$R_t f \in C(\Delta_e), \qquad t \ge 0,$$

and $R_t f \to f$ uniformly on $\Delta_e$ as $t \to 0$.

Proof. Let us define an operator $G$ acting on continuous and bounded functions $\psi : \mathbb{R}_+ \times \Delta_e \to \mathbb{R}$ by the formula

$$G\psi(t, \nu) = f(\phi(t, \nu))\, e^{-M(t, \nu)} + \int_0^t \int_{\Delta_e} \psi(t - s, \rho)\, Q(\phi(s, \nu), d\rho)\, \lambda(\phi(s, \nu))\, e^{-M(s, \nu)}\, ds,$$

for $t \ge 0$, $\nu \in \Delta_e$, where $M(t, \nu) = \int_0^t \lambda(\phi(s, \nu))\, ds$. Let $G^n$ denote the $n$-th iterate of $G$. Since the function $\nu \mapsto \int_{\Delta_e} g(\rho)\, Q(\nu, d\rho)\, \lambda(\nu)$ is continuous for every $g \in C(\Delta_e)$ by proposition 5.3, $G\psi$ (and hence $G^n\psi$) is continuous and bounded on $\mathbb{R}_+ \times \Delta_e$. It is proved in [10], proof of Theorem 27.6, that for every fixed $\psi$ we have $G^n\psi(t, \nu) \to R_t f(\nu)$ uniformly in $\nu \in \Delta_e$ for all $t \ge 0$ as $n \to \infty$, so we immediately conclude that $R_t f \in C(\Delta_e)$ for every $t \ge 0$.

To prove the final statement of the proposition note that

$$R_t f(\nu) = E_\nu[f(\Pi^\nu_t)\, 1_{t < T_1}] + E_\nu[f(\Pi^\nu_t)\, 1_{t \ge T_1}] = f(\phi(t, \nu))\, P_\nu(t < T_1) + E_\nu[f(\Pi^\nu_t)\, 1_{t \ge T_1}],$$

so that

$$|R_t f(\nu) - f(\nu)| = |f(\phi(t, \nu)) - f(\nu) - f(\phi(t, \nu))\, P_\nu(t \ge T_1) + E_\nu[f(\Pi^\nu_t)\, 1_{t \ge T_1}]|$$
$$\le |f(\phi(t, \nu)) - f(\nu)| + 2\, (\sup |f|)\, P_\nu(t \ge T_1).$$

By the properties of the flow we have $|f(\phi(t, \nu)) - f(\nu)| \to 0$ as $t \to 0$, uniformly in $\nu \in \Delta_e$. Recalling formula (5.2) for the distribution of $T_1$ and denoting by $\bar\lambda$ an upper bound for the jump rate function $\lambda$ we have

$$P_\nu(T_1 \le t) = 1 - \exp\left(-\int_0^t \lambda(\phi(s, \nu))\, ds\right) \le 1 - \exp(-t\bar\lambda),$$

which shows that $P_\nu(T_1 \le t) \to 0$ as $t \to 0$, uniformly in $\nu \in \Delta_e$. □

5.4 A canonical version of the PDP filtering process 

The introduction of a canonical version is useful for applications and will be used in section 6. We follow the notation of [10], section 25. Proposition 5.6 is the only new result in this subsection, and it will be applied in section 6.

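1. Let $\bar\Omega$ be the set of càdlàg functions $\bar\omega : \mathbb{R}_+ \to \Delta_e$. We denote $\bar\Pi_t(\bar\omega) = \bar\omega(t)$ for $\bar\omega \in \bar\Omega$ and $t \ge 0$, and we introduce the $\sigma$-algebras

$$\bar{\mathcal{F}}^0_t = \sigma(\bar\Pi_s : s \in [0, t]), \qquad \bar{\mathcal{F}}^0 = \sigma(\bar\Pi_s : s \ge 0).$$

$(\bar{\mathcal{F}}^0_t)_{t \ge 0}$ is thus the natural filtration of $(\bar\Pi_t)_{t \ge 0}$.

2. For every $\nu \in \Delta_e$, we denote by $\bar P_\nu$ the law of the process $(\Pi^\nu_t)$ defined in $(\Omega, \mathcal{F}^0, P_\nu)$. Thus, $\bar P_\nu$ is the probability measure on $(\bar\Omega, \bar{\mathcal{F}}^0)$ such that $\bar P_\nu(\Gamma) = P_\nu(\omega \in \Omega : \Pi^\nu(\omega) \in \Gamma)$ and $\nu$ is the starting point of $(\bar\Pi_t)$ under $\bar P_\nu$. This definition is meaningful since the trajectory $t \mapsto \Pi^\nu_t(\omega)$, denoted $\Pi^\nu(\omega)$, belongs to $\bar\Omega$ for $P_\nu$-almost all $\omega$ and the map $\omega \mapsto \Pi^\nu(\omega)$ is measurable from $(\Omega, \mathcal{F}^0)$ to $(\bar\Omega, \bar{\mathcal{F}}^0)$.

For every Borel probability measure $Q$ on $\Delta_e$, we define a probability $\bar P_Q$ on $(\bar\Omega, \bar{\mathcal{F}}^0)$ by $\bar P_Q(\Gamma) = \int_{\Delta_e} \bar P_\nu(\Gamma)\, Q(d\nu)$ for $\Gamma \in \bar{\mathcal{F}}^0$. Thus, $Q$ is the initial distribution of $(\bar\Pi_t)$ under $\bar P_Q$.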

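3. We denote by $\bar{\mathcal{F}}^Q$ the $\bar P_Q$-completion of $\bar{\mathcal{F}}^0$ and we assume that $\bar P_Q$ is extended to $\bar{\mathcal{F}}^Q$ in the natural way. We denote by $\bar{\mathcal{N}}^Q$ the family of elements of $\bar{\mathcal{F}}^Q$ with zero $\bar P_Q$-probability and we define

$$\bar{\mathcal{F}}^Q_t = \sigma(\bar{\mathcal{F}}^0_t, \bar{\mathcal{N}}^Q), \qquad \bar{\mathcal{F}}_t = \bigcap_Q \bar{\mathcal{F}}^Q_t, \qquad t \ge 0,$$

where the intersection is taken over all Borel probability measures $Q$ on $\Delta_e$.

$(\bar{\mathcal{F}}_t)_{t \ge 0}$ is called the natural completed filtration of $(\bar\Pi_t)$. It is right-continuous, i.e. for all $t \ge 0$ we have $\bar{\mathcal{F}}_t = \bar{\mathcal{F}}_{t+} := \cap_{\varepsilon > 0} \bar{\mathcal{F}}_{t+\varepsilon}$: see [10], theorem 25.3.

We will refer to the PDP on $\Delta_e$ constructed above as the 5-tuple $(\bar\Omega, \bar{\mathcal{F}}, (\bar\Pi_t)_{t \ge 0}, (\bar P_\nu)_{\nu \in \Delta_e}, (\bar{\mathcal{F}}_t)_{t \ge 0})$.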

Now suppose that $\mu \in \Delta$ and consider the probability space $(\Omega, \mathcal{F}^0, P_\mu)$ and the filtering process $(\Pi^\mu_t)$ defined by (3.1). Since $\mu$ does not belong to $\Delta_e$ in general, the law of $(\Pi^\mu_t)$ is not immediately known.

Proposition 5.6 For every $\mu \in \Delta$ the law of $(\Pi^\mu_t)$ under $P_\mu$ is $\bar P_Q$, where $Q$ is the Borel probability measure on $\Delta_e$ concentrated at points $H_a[\mu] \in \Delta_e$ ($a \in O$) such that

$$Q(\{H_a[\mu]\}) = \mu(h^{-1}(a)), \qquad a \in O.$$

Proof. Define $\rho \in \Delta$ setting $\rho = \sum_{a \in O} \mu(h^{-1}(a))\, H_a[\mu]$. Since $\rho$ is a convex combination of probabilities $H_a[\mu]$ in $\Delta_e$, the law of $(\Pi^\rho_t)$ under $P_\rho$ is $\sum_{a \in O} \mu(h^{-1}(a))\, \bar P_{H_a[\mu]} = \bar P_Q$. So to finish the proof it is enough to show that the laws of $(\Pi^\mu_t)$ under $P_\mu$ and $P_\rho$ are the same. The filtering equation (3.1) and the definition of the observation process $Y_t = h(X_t)$ imply that the trajectories $\Pi^\mu(\omega)$ of the filtering process are a deterministic functional of the corresponding trajectories $X_\cdot(\omega)$ of the process $X$. So it is enough to show that $P_\mu$ and $P_\rho$ coincide on the canonical space $(\Omega, \mathcal{F}^0)$. Since the generator $\Lambda$ is fixed, it only remains to check that the law of $X_0$ is the same under $P_\mu$ and $P_\rho$. This is, however, immediate, since recalling the definition of the operator $H$ in subsection 2.3 we verify that if $i \in I$ and $h(i) = b \in O$ then we have

$$P_\rho(X_0 = i) = \mu(h^{-1}(b))\, H_b[\mu](i) = \mu(i) = P_\mu(X_0 = i). \qquad □$$

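The initial law $Q$ of proposition 5.6 is easy to compute for a concrete $\mu$. The following Python sketch lists its atoms $H_a[\mu]$ with their weights $\mu(h^{-1}(a))$ and checks the identity used in the proof, namely that the weighted sum of the atoms gives back $\mu$. The observation map `H`, the vector `mu` and the function name `initial_law` are hypothetical choices made for illustration; $H_a[\mu]$ is taken to be $\mu$ restricted to $h^{-1}(a)$ and normalized, and atoms of zero weight are skipped.

```python
H = ['a', 'a', 'b']   # hypothetical observation map h on I = {0, 1, 2}

def initial_law(mu):
    """Atoms and weights of Q: pairs (H_a[mu], mu(h^{-1}(a))) over a with positive weight."""
    law = []
    for a in sorted(set(H)):
        w = sum(m for m, lab in zip(mu, H) if lab == a)        # mu(h^{-1}(a))
        if w > 0:
            atom = [m / w if lab == a else 0.0 for m, lab in zip(mu, H)]
            law.append((atom, w))
    return law

mu = [0.2, 0.3, 0.5]
law = initial_law(mu)
# Convex combination of the atoms, as in the definition of rho in the proof.
mixture = [sum(w * atom[i] for atom, w in law) for i in range(len(mu))]
```

Here the atoms are $(0.4, 0.6, 0)$ and $(0, 0, 1)$, each with weight $0.5$, and `mixture` reproduces `mu` up to rounding.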
6 Optimal stopping with partial observation 

We assume that $I$, $\Lambda$, $h$ are given as in the previous paragraphs and we consider again the canonical set-up introduced in subsection 3.1. Thus $(X_t)$ is the canonical coordinate process in the space $\Omega$ of càdlàg functions $\omega : \mathbb{R}_+ \to I$, $(\mathcal{F}^0_t)$ is the natural filtration and $\mathcal{F}^0$ is the $\sigma$-algebra generated by $X$. For $\mu \in \Delta$, $P_\mu$ denotes the probability on $(\Omega, \mathcal{F}^0)$ that makes $(X_t)$ a Markov process on $I$ with generator $\Lambda$ and initial distribution $\mu$. The observation process and its natural filtration are still denoted $(Y_t)$ and $(\mathcal{Y}^0_t)$ respectively.

For $\mu \in \Delta$ we denote by $\mathcal{F}^\mu$ the $P_\mu$-completion of $\mathcal{F}^0$ and we assume that $P_\mu$ is extended to $\mathcal{F}^\mu$ in the natural way. We denote by $\mathcal{N}^\mu$ the family of elements of $\mathcal{F}^\mu$ with zero $P_\mu$-probability and we define

$$\mathcal{Y}^\mu_t = \sigma(\mathcal{Y}^0_t, \mathcal{N}^\mu), \qquad t \ge 0.$$

$(\mathcal{Y}^\mu_t)_{t \ge 0}$ is called the natural completed filtration of $(Y_t)$. As a consequence of the fact that $(Y_t)$ has piecewise-constant, right-continuous trajectories under $P_\mu$, with a finite number of jumps in every bounded interval a.s., it can be proved that $(\mathcal{Y}^\mu_t)$ is right-continuous, i.e. for all $t \ge 0$ we have $\mathcal{Y}^\mu_t = \mathcal{Y}^\mu_{t+} := \cap_{\varepsilon > 0} \mathcal{Y}^\mu_{t+\varepsilon}$: see [5], Appendix A2, or [10], Appendix A2. Consequently the filtered probability space $(\Omega, \mathcal{F}^\mu, (\mathcal{Y}^\mu_t), P_\mu)$ satisfies the usual conditions.

We denote by $\mathcal{T}^\mu$ the class of functions $\tau : \Omega \to [0, \infty]$ that are stopping times with respect to $(\mathcal{Y}^\mu_t)$.

In addition to $I$, $\Lambda$, $h$, the stopping problem with partial observation is defined by a pair of functions $g, l : I \to \mathbb{R}$, called stopping cost and running cost respectively, and a real number $\alpha > 0$, called discount factor. This terminology is justified by the introduction of the following cost functional:

$$J(\mu, \tau) = E_\mu\left[e^{-\alpha\tau} g(X_\tau) + \int_0^\tau e^{-\alpha s}\, l(X_s)\, ds\right], \qquad \mu \in \Delta,\ \tau \in \mathcal{T}^\mu,$$

that one tries to minimize with respect to $\tau \in \mathcal{T}^\mu$. Here we adopt the convention that $e^{-\alpha\tau} g(X_\tau) = 0$ if $\tau = \infty$; similar conventions will be tacitly used in the following. The corresponding value function is defined by

$$V(\mu) = \inf_{\tau \in \mathcal{T}^\mu} J(\mu, \tau), \qquad \mu \in \Delta.$$

A stopping time $\tau^{*,\mu} \in \mathcal{T}^\mu$ is called optimal (relative to $\mu \in \Delta$) if $V(\mu) = J(\mu, \tau^{*,\mu})$. The optimal stopping problem consists in finding characterizations of $V$ and giving conditions ensuring the existence of an optimal stopping time and a description of it, for all $\mu \in \Delta$.

Remark 6.1 In the definition of $J$ and $V$ we could replace the class $\mathcal{T}^\mu$ of stopping times relative to $(\mathcal{Y}^\mu_t)$ by the class of stopping times relative to the natural, uncompleted filtration $(\mathcal{Y}^0_t)$. However, the former is much larger and, due to the fact that $(\mathcal{Y}^\mu_t)$ satisfies the usual conditions, it includes many interesting random times (for instance the first entry time of a càdlàg adapted process in a Borel set). For this reason we have chosen the formulation above.



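For every $\mu \in \Delta$ we still denote by $(\Pi^\mu_t)$ the filtering process defined by equation (3.1).

Lemma 6.2 For every $\mu \in \Delta$ and $\tau \in \mathcal{T}^\mu$ we have

$$J(\mu, \tau) = E_\mu\left[e^{-\alpha\tau}\, \Pi^\mu_\tau g + \int_0^\tau e^{-\alpha s}\, \Pi^\mu_s l\, ds\right].$$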

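Proof. This is a direct consequence of the properties of the optional projections of $(e^{-\alpha t} g(X_t))$ and $(e^{-\alpha t} l(X_t))$, but it can also be easily proved by elementary considerations as follows.

Assume first that $\tau$ has a finite number of values. Excluding the trivial case of $\tau$ being constant, there exist an integer $n > 1$, numbers $0 \le t_1 < \ldots < t_{n-1} < t_n = \infty$ and disjoint sets $A_i \subset \Omega$ ($1 \le i \le n$) such that $\cup_{i=1}^n A_i = \Omega$, $A_i \in \mathcal{Y}^\mu_{t_i}$ for $1 \le i < n$ and $\tau = \sum_{i=1}^n t_i 1_{A_i}$. Since $\Pi^\mu_t(i) = P_\mu(X_t = i \mid \mathcal{Y}^0_t)$ we have $E_\mu[g(X_{t_i})\, 1_{A_i}] = E_\mu[\Pi^\mu_{t_i} g\, 1_{A_i}]$ and it follows that

$$E_\mu[e^{-\alpha\tau} g(X_\tau)] = \sum_{i=1}^{n-1} e^{-\alpha t_i} E_\mu[g(X_{t_i})\, 1_{A_i}] = \sum_{i=1}^{n-1} e^{-\alpha t_i} E_\mu[\Pi^\mu_{t_i} g\, 1_{A_i}] = E_\mu[e^{-\alpha\tau}\, \Pi^\mu_\tau g].$$

Given a general $\tau \in \mathcal{T}^\mu$, let $\tau_n$ be a nonincreasing sequence of $(\mathcal{Y}^\mu_t)$-stopping times such that $\tau_n \to \tau$ and each $\tau_n$ has a finite number of values. In the equality $E_\mu[e^{-\alpha\tau_n} g(X_{\tau_n})] = E_\mu[e^{-\alpha\tau_n}\, \Pi^\mu_{\tau_n} g]$ we let $n \to \infty$. We note that $X_{\tau_n} \to X_\tau$, $\Pi^\mu_{\tau_n} \to \Pi^\mu_\tau$ by the right-continuity of $X$ and $\Pi^\mu$, and we conclude that $E_\mu[e^{-\alpha\tau} g(X_\tau)] = E_\mu[e^{-\alpha\tau}\, \Pi^\mu_\tau g]$.

The equality $E_\mu \int_0^\tau e^{-\alpha s}\, l(X_s)\, ds = E_\mu \int_0^\tau e^{-\alpha s}\, \Pi^\mu_s l\, ds$ can be proved in a similar way. □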

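Let again $(\bar\Omega, \bar{\mathcal{F}}, (\bar\Pi_t)_{t \ge 0}, (\bar P_\nu)_{\nu \in \Delta_e}, (\bar{\mathcal{F}}_t)_{t \ge 0})$ be the PDP on $\Delta_e$ defined in subsection 5.4. We first need a technical result.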

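Lemma 6.3 For every $\mu \in \Delta$ and $\tau \in \mathcal{T}^\mu$, there exists an $(\bar{\mathcal{F}}^0_{t+})$-stopping time $\bar\tau$ such that $\tau(\omega) = \bar\tau(\Pi^\mu(\omega))$ for $P_\mu$-almost all $\omega \in \Omega$. $\bar\tau$ is also an $(\bar{\mathcal{F}}_t)$-stopping time.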

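Proof. The last assertion of the lemma is obvious, since $\bar{\mathcal{F}}^0_t \subset \bar{\mathcal{F}}_t$ and $(\bar{\mathcal{F}}_t)$ is a right-continuous filtration.

Let us also note that the mapping $\Pi^\mu : \Omega \to \bar\Omega$, $\omega \mapsto \Pi^\mu(\omega)$ is measurable with respect to $\mathcal{F}^0$ and $\bar{\mathcal{F}}^0$: it is enough to note that for every $t$ the composition $\bar\Pi_t \circ \Pi^\mu : \Omega \to \Delta_e$ is measurable with respect to $\mathcal{F}^0$ and $\mathcal{B}(\Delta_e)$ since $(\bar\Pi_t \circ \Pi^\mu)(\omega) = \Pi^\mu_t(\omega)$.

The rest of the proof consists in the construction of $\bar\tau$ and will be divided into several steps.

Step 1: we show that $\mathcal{Y}^\mu_t := \sigma(\mathcal{Y}^0_t, \mathcal{N}^\mu)$ coincides with $\mathcal{H} := \{A \subset \Omega : \exists B \in \mathcal{Y}^0_t,\ A \triangle B \in \mathcal{N}^\mu\}$.

The argument is standard. The inclusion $\mathcal{H} \subset \mathcal{Y}^\mu_t$ is easy and the inclusion $\mathcal{Y}^\mu_t \subset \mathcal{H}$ is proved by verifying that $\mathcal{H}$ is a $\sigma$-algebra containing $\mathcal{Y}^0_t$ and $\mathcal{N}^\mu$.

Step 2: for every $t \ge 0$ we have $\mathcal{Y}^0_t = (\Pi^\mu)^{-1}(\bar{\mathcal{F}}^0_t)$.

The $\sigma$-algebra $\bar{\mathcal{F}}^0_t$ is generated by the family of sets of the form

$$B = \{\bar\omega \in \bar\Omega : \bar\omega(s_1) \in A_1, \ldots, \bar\omega(s_n) \in A_n\}, \qquad (6.1)$$

for some integer $n$ and some $0 \le s_1 < \ldots < s_n \le t$, $A_i \in \mathcal{B}(\Delta_e)$ ($1 \le i \le n$). Then $(\Pi^\mu)^{-1}(B)$ is the set

$$\{\omega \in \Omega : \Pi^\mu_{s_1}(\omega) \in A_1, \ldots, \Pi^\mu_{s_n}(\omega) \in A_n\}, \qquad (6.2)$$

and this proves that $(\Pi^\mu)^{-1}(\bar{\mathcal{F}}^0_t) \subset \mathcal{Y}^0_t$.

Conversely, $\mathcal{Y}^0_t$ is generated by sets of the form (6.2), and each such set belongs to $(\Pi^\mu)^{-1}(\bar{\mathcal{F}}^0_t)$ since it has the form $(\Pi^\mu)^{-1}(B)$ for $B$ given as in (6.1). So we deduce that also $\mathcal{Y}^0_t \subset (\Pi^\mu)^{-1}(\bar{\mathcal{F}}^0_t)$.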
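Step 3: we prove the result assuming that $\tau \in \mathcal{T}^\mu$ has a finite number of values.

Excluding the trivial case where $\tau$ is constant, there exist an integer $n > 1$, numbers $0 \le t_1 < \ldots < t_{n-1} < t_n = \infty$ and disjoint sets $A_i \subset \Omega$ ($1 \le i \le n$) such that $\cup_{i=1}^n A_i = \Omega$, $A_i \in \mathcal{Y}^\mu_{t_i}$ for $1 \le i < n$ and $\tau(\omega) = \sum_{i=1}^n t_i 1_{A_i}(\omega)$. By Step 1 there exist $N_i \in \mathcal{N}^\mu$ and $B_i \in \mathcal{Y}^0_{t_i}$ ($1 \le i < n$) such that $A_i \triangle B_i = N_i$. By Step 2 there exist $C_i \in \bar{\mathcal{F}}^0_{t_i}$ ($1 \le i < n$) such that $(\Pi^\mu)^{-1}(C_i) = B_i$. Let us define $\bar\tau : \bar\Omega \to [0, \infty]$ setting

$$\bar\tau(\bar\omega) = t_1, \qquad \bar\omega \in C_1,$$
$$\bar\tau(\bar\omega) = t_i, \qquad \bar\omega \in C_i \setminus (C_1 \cup \ldots \cup C_{i-1}),\ 1 < i < n,$$
$$\bar\tau(\bar\omega) = \infty, \qquad \bar\omega \notin C_1 \cup \ldots \cup C_{n-1}.$$

$\bar\tau$ is clearly an $(\bar{\mathcal{F}}^0_t)$-stopping time. To finish the proof of Step 3 it is enough to show that

$$\tau(\omega) = \bar\tau(\Pi^\mu(\omega)), \qquad \omega \notin N_1 \cup \ldots \cup N_{n-1}. \qquad (6.3)$$

Let us fix $\omega \notin N_1 \cup \ldots \cup N_{n-1}$. First suppose that $\Pi^\mu(\omega) \in C_i \setminus (C_1 \cup \ldots \cup C_{i-1})$ for some $1 < i < n$. Then in particular $\Pi^\mu(\omega) \in C_i$ and therefore $\omega \in B_i$, and since $\omega \notin N_i$ it follows that $\omega \in A_i$ and $\tau(\omega) = t_i = \bar\tau(\Pi^\mu(\omega))$. In the case $\Pi^\mu(\omega) \in C_1$ the argument is similar. Finally if $\Pi^\mu(\omega) \notin C_1 \cup \ldots \cup C_{n-1}$ then for $1 \le i < n$ we have $\omega \notin B_i$ and since $\omega \notin N_i$ it follows that $\omega \notin A_i$, so we deduce that $\tau(\omega) = \infty = \bar\tau(\Pi^\mu(\omega))$. Now (6.3) is proved.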
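Step 4: conclusion.

Given a general $\tau \in \mathcal{T}^\mu$, let $\tau_n$ be a nonincreasing sequence of $(\mathcal{Y}^\mu_t)$-stopping times such that $\tau_n \to \tau$ and each $\tau_n$ has a finite number of values. Let $\bar\tau_n$ be constructed starting from $\tau_n$ as in Step 3, so that $P_\mu$-a.s. we have $\tau_n = \bar\tau_n(\Pi^\mu)$ for all $n$. Let us define $\bar\tau(\bar\omega) = \liminf_{n \to \infty} \bar\tau_n(\bar\omega)$ for all $\bar\omega \in \bar\Omega$ and let $A \subset \bar\Omega$ be the set where $\lim_{n \to \infty} \bar\tau_n$ exists (finite or infinite). Then $\bar\tau$ is an $(\bar{\mathcal{F}}^0_{t+})$-stopping time. Moreover for $P_\mu$-almost all $\omega \in \Omega$ we have $\bar\tau_n(\Pi^\mu(\omega)) = \tau_n(\omega) \to \tau(\omega)$, which shows that $\Pi^\mu(\omega) \in A$ and $\bar\tau(\Pi^\mu(\omega)) = \tau(\omega)$. □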

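Lemma 6.4 For every $\mu \in \Delta$ and $\tau \in \mathcal{T}^\mu$, let $\bar\tau$ denote an $(\bar{\mathcal{F}}_t)$-stopping time such that $\tau(\omega) = \bar\tau(\Pi^\mu(\omega))$ for $P_\mu$-almost all $\omega \in \Omega$. Then we have

$$J(\mu, \tau) = \bar E_Q\left[e^{-\alpha\bar\tau}\, \bar\Pi_{\bar\tau} g + \int_0^{\bar\tau} e^{-\alpha s}\, \bar\Pi_s l\, ds\right],$$

where $Q$ is the probability measure on $\Delta_e$ concentrated at points $H_a[\mu]$ ($a \in O$) such that

$$Q(\{H_a[\mu]\}) = \mu(h^{-1}(a)), \qquad a \in O.$$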
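Proof. Define a real function $\Phi$ on $\bar\Omega$ by

$$\Phi(\bar\omega) = e^{-\alpha\bar\tau(\bar\omega)}\, \bar\Pi_{\bar\tau(\bar\omega)}(\bar\omega)\, g + \int_0^{\bar\tau(\bar\omega)} e^{-\alpha s}\, \bar\Pi_s(\bar\omega)\, l\, ds, \qquad \bar\omega \in \bar\Omega.$$

It can be checked that $\Phi$ is bounded and $\bar{\mathcal{F}}$-measurable. Taking $\bar\omega = \Pi^\mu(\omega)$ we obtain $\bar\Pi_s(\bar\omega) = \bar\omega(s) = \Pi^\mu_s(\omega)$ and for $\bar\tau(\bar\omega) < \infty$ we have $\bar\Pi_{\bar\tau(\bar\omega)}(\bar\omega) = \bar\omega(\bar\tau(\bar\omega)) = \Pi^\mu_{\bar\tau(\bar\omega)}(\omega)$. Since $\bar\tau(\bar\omega) = \tau(\omega)$ $P_\mu$-a.s. we conclude that

$$\Phi(\Pi^\mu(\omega)) = e^{-\alpha\tau(\omega)}\, \Pi^\mu_{\tau(\omega)}(\omega)\, g + \int_0^{\tau(\omega)} e^{-\alpha s}\, \Pi^\mu_s(\omega)\, l\, ds, \qquad P_\mu\text{-a.s.}$$

From lemma 6.2 and proposition 5.6 we have

$$J(\mu, \tau) = E_\mu[\Phi(\Pi^\mu)] = \int_{\bar\Omega} \Phi(\bar\omega)\, \bar P_Q(d\bar\omega) = \bar E_Q\left[e^{-\alpha\bar\tau}\, \bar\Pi_{\bar\tau} g + \int_0^{\bar\tau} e^{-\alpha s}\, \bar\Pi_s l\, ds\right]. \qquad □$$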



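We now formulate an auxiliary optimal stopping problem with complete observation for the PDP $(\bar\Omega, \bar{\mathcal{F}}, (\bar\Pi_t)_{t \ge 0}, (\bar P_\nu)_{\nu \in \Delta_e}, (\bar{\mathcal{F}}_t)_{t \ge 0})$. We denote by $\bar{\mathcal{T}}$ the class of $(\bar{\mathcal{F}}_t)$-stopping times defined on $\bar\Omega$, and we define the cost functional

$$\bar J(\nu, \bar\tau) = \bar E_\nu\left[e^{-\alpha\bar\tau}\, \bar\Pi_{\bar\tau} g + \int_0^{\bar\tau} e^{-\alpha s}\, \bar\Pi_s l\, ds\right], \qquad \bar\tau \in \bar{\mathcal{T}},\ \nu \in \Delta_e,$$

and the value function

$$v(\nu) = \inf_{\bar\tau \in \bar{\mathcal{T}}} \bar J(\nu, \bar\tau), \qquad \nu \in \Delta_e.$$

Note that in this formulation the class of stopping times $\bar{\mathcal{T}}$ does not depend on $\nu$, but it is sufficiently rich since the filtration $(\bar{\mathcal{F}}_t)$ satisfies the usual conditions: compare remark 6.1.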

The proof of the following result can be found in [3], Chapter VII, Theorem 6.1. We do not repeat the assumptions of this theorem but we note that they are all trivially verified in our situation; the only nontrivial verifications are the properties of the semigroup $(R_t)$ which were proved above in proposition 5.5.

Theorem 6.5 The value function $v$ is continuous. The random time defined by

$$\bar\tau^* = \inf\{t \ge 0 : \bar\Pi_t g = v(\bar\Pi_t)\}$$

belongs to $\bar{\mathcal{T}}$ and it is optimal relative to every $\nu \in \Delta_e$:

$$v(\nu) = \bar J(\nu, \bar\tau^*), \qquad \nu \in \Delta_e.$$

We are now ready to present the main result of this section. 

Theorem 6.6 For $\mu \in \Delta$, let $(\Pi^\mu_t)$ be the filtering process defined by equation (3.1). Define the random time

$$\tau^{*,\mu} = \inf\{t \ge 0 : \Pi^\mu_t g = v(\Pi^\mu_t)\}.$$

Then $\tau^{*,\mu}$ belongs to $\mathcal{T}^\mu$ and it is optimal relative to $\mu$. Moreover the value function is given by the formula

$$V(\mu) = \sum_{a \in O} \mu(h^{-1}(a))\, v(H_a[\mu]), \qquad \mu \in \Delta.$$

Remark 6.7 (i) The notation emphasizes that $\tau^{*,\mu}$ depends on $\mu$, since $(\Pi^\mu_t)$ does.

(ii) If $\nu \in \Delta_e$ then we have $V(\nu) = v(\nu)$. Indeed if $\nu \in \Delta_b$ for some $b \in O$ then for all $a \in O$ we have $\nu(h^{-1}(a)) = 1_{a = b}$ and $H_b[\nu] = \nu$.

(iii) $\tau^{*,\mu}$ is the first entry time of $\Pi^\mu$ in the contact set $\{\nu \in \Delta_e : \nu g = v(\nu) = V(\nu)\}$.

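Proof. Let $\mu \in \Delta$ be fixed. For arbitrary $\tau \in \mathcal{T}^\mu$, by lemma 6.3 there exists an $(\bar{\mathcal{F}}_t)$-stopping time $\bar\tau$ such that $\tau = \bar\tau(\Pi^\mu)$ $P_\mu$-a.s. By lemma 6.4 and the definitions of $\bar P_Q$ and $\bar J$,

$$J(\mu, \tau) = \bar E_Q\left[e^{-\alpha\bar\tau}\, \bar\Pi_{\bar\tau} g + \int_0^{\bar\tau} e^{-\alpha s}\, \bar\Pi_s l\, ds\right] = \sum_{a \in O} \mu(h^{-1}(a))\, \bar E_{H_a[\mu]}\left[e^{-\alpha\bar\tau}\, \bar\Pi_{\bar\tau} g + \int_0^{\bar\tau} e^{-\alpha s}\, \bar\Pi_s l\, ds\right] = \sum_{a \in O} \mu(h^{-1}(a))\, \bar J(H_a[\mu], \bar\tau). \qquad (6.4)$$

An application of theorem 6.5 shows that

$$J(\mu, \tau) \ge \sum_{a \in O} \mu(h^{-1}(a))\, \bar J(H_a[\mu], \bar\tau^*) = \sum_{a \in O} \mu(h^{-1}(a))\, v(H_a[\mu]). \qquad (6.5)$$

Now consider $\tau^{*,\mu}$ and note that it is an element of $\mathcal{T}^\mu$ by the continuity of $v$. Moreover we have $\tau^{*,\mu} = \bar\tau^*(\Pi^\mu)$ $P_\mu$-a.s. and so, arguing as in (6.4), we obtain

$$J(\mu, \tau^{*,\mu}) = \sum_{a \in O} \mu(h^{-1}(a))\, \bar J(H_a[\mu], \bar\tau^*).$$

Applying again theorem 6.5 we have

$$J(\mu, \tau^{*,\mu}) = \sum_{a \in O} \mu(h^{-1}(a))\, v(H_a[\mu]).$$

Comparing with (6.5) we conclude that $\tau^{*,\mu}$ is optimal relative to $\mu$ and that the right-hand side of the last formula equals $V(\mu)$. □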
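As a small worked instance of the formula $V(\mu) = \sum_{a \in O} \mu(h^{-1}(a))\, v(H_a[\mu])$ in theorem 6.6, consider a hypothetical example (not taken from this paper) with $I = \{0, 1, 2\}$, $O = \{a, b\}$, $h(0) = h(1) = a$, $h(2) = b$ and initial distribution $\mu = (0.2, 0.3, 0.5)$. Only the weights and the points at which $v$ is evaluated are computed here; the function $v$ itself is left unspecified.

```latex
\begin{align*}
\mu(h^{-1}(a)) &= 0.2 + 0.3 = 0.5, & H_a[\mu] &= \tfrac{1}{0.5}\,(0.2,\ 0.3,\ 0) = (0.4,\ 0.6,\ 0),\\
\mu(h^{-1}(b)) &= 0.5,             & H_b[\mu] &= \tfrac{1}{0.5}\,(0,\ 0,\ 0.5) = (0,\ 0,\ 1),\\
V(\mu) &= 0.5\, v(0.4,\ 0.6,\ 0) + 0.5\, v(0,\ 0,\ 1).
\end{align*}
```

So under these assumptions the partially observed value at $\mu$ is an average of two values of the fully observed problem on $\Delta_e$, one for each possible initial observation $Y_0 \in \{a, b\}$.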



A satisfactory solution of the partially observed optimal stopping problem should also include a characterization of its value function $V$. The last assertion of theorem 6.6 shows that this problem is reduced to finding characterizations of the value function $v$ of the optimal stopping problem for the PDP. Since this is a fully observable problem many results on analytical characterizations of $v$ are known, mostly in the form of obstacle problems. For instance in Theorem 6.1 of Chapter VII of [3], mentioned above as theorem 6.5, it is also proved that $v$ satisfies the system of inequalities on $\Delta_e$

$$v \le \psi, \qquad v \le e^{-\alpha t} R_t v + \int_0^t e^{-\alpha s} R_s L\, ds, \quad t \ge 0, \qquad (6.6)$$



where $(R_t)$ is the transition semigroup of the PDP introduced before, the obstacle $\psi$ is simply $\psi(\nu) = \nu g$ and the function $L$ is $L(\nu) = \nu l$ ($\nu \in \Delta_e$), and moreover $v$ is the maximum element among all real continuous functions on $\Delta_e$ satisfying (6.6).

In the specific case of optimal stopping for PDPs many other analytical characterizations of $v$ can be found in the literature (although sometimes under assumptions slightly different from ours): see for instance [12], [13] or the monograph [10] and the references therein.